From 521ad3ebb0167d6f5ab600bc45013b76a315fe1c Mon Sep 17 00:00:00 2001
From: Michael Neale
Date: Tue, 13 May 2025 04:27:53 +1000
Subject: [PATCH] short post on qwen3 (#2508)

---
 .../2025-05-12-local-goose-qwen3/index.md | 69 +++++++++++++++++++
 1 file changed, 69 insertions(+)
 create mode 100644 documentation/blog/2025-05-12-local-goose-qwen3/index.md

diff --git a/documentation/blog/2025-05-12-local-goose-qwen3/index.md b/documentation/blog/2025-05-12-local-goose-qwen3/index.md
new file mode 100644
index 00000000..15269958
--- /dev/null
+++ b/documentation/blog/2025-05-12-local-goose-qwen3/index.md
@@ -0,0 +1,69 @@
+---
+title: "Goose and Qwen3 for local execution"
+description: "Qwen3 and tool calling with goose, an example local workflow"
+authors:
+  - mic
+---
+
+A couple of weeks back, [Qwen3](https://qwenlm.github.io/blog/qwen3/) launched with a raft of capabilities and sizes.
+
+This model showed promise, and even in very compact form, such as 8B parameters with 4-bit quantisation, it was able to do tool calling successfully with goose. Even multi-turn tool calling.
+
+I haven't seen tool calling work with such a scaled-down model before, so this is really impressive and bodes well not just for this model but also for future open-weight models, both large and small.
+I would expect the larger Qwen3 models to work quite well on various tasks, but even this small one I found useful.
+
+# Local workflows and local agents
+
+For some time I have had a little helper function in my `~/.zshrc` file for command line usage:
+
+```zsh
+# zsh helper to use goose if you make a typo or just want to yolo into the shell
+command_not_found_handler() {
+  local cmd="$*"
+  echo "🪿:"
+  goose run -t "can you try to run this command please: $cmd"
+}
+```
+
+This makes use of a zsh feature (zsh now being the default shell on macOS) that delegates to that function when nothing you typed matches a known command.
+This lets me either make typos or just type what I want into the command line, such as `$> can you kill whatever is listening on port 8000`, and goose will do the work; I don't even need to open a goose session.
+
+With Qwen3 and Ollama running entirely locally with goose, it worked well enough that I switched over to a completely local version of that workflow:
+
+```zsh
+# fully local version: use the ollama provider with a local qwen3 model
+command_not_found_handler() {
+  local cmd="$*"
+  echo "🪿:"
+  GOOSE_PROVIDER=ollama GOOSE_MODEL=michaelneale/qwen3 goose run -t "can you try to run this command please: $cmd"
+}
+```
+
+which works when I am offline, on the train, and so on.
+
+# Qwen3 reasoning
+
+By default, Qwen3 models will "think" (reason) about the problem, as they are general-purpose models, but I found it was quicker (and worked better for my purposes) to skip this reasoning stage.
+By adding `/no_think` to the system prompt, it will generally skip straight to execution (this may make it less successful at larger tasks, but here it is a small model handling just a few turns of tool calls).
+
+I made a small tweak to the default Ollama chat template, published at [ollama.com/michaelneale/qwen3](https://ollama.com/michaelneale/qwen3), which you can use as shown above if you like (the default `qwen3` model hosted by Ollama also works fine out of the box).
+
+## Advanced tips
+
+You can use the goose `/plan` mode with a separate model (perhaps Qwen3 with reasoning enabled, or another model such as DeepSeek) to help plan actions before shifting to Qwen3 for the execution (via tool calls).
+
+It would be interesting to try the larger models if you have access to the hardware (I have only used the 8B parameter one).
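+If you do have the headroom, swapping in a larger variant should just be a matter of pulling a different tag. A rough, untested sketch (assuming a larger Qwen3 tag such as `qwen3:32b` is available in the Ollama library):
+
+```zsh
+# untested sketch: pull a larger Qwen3 variant and point goose at it
+ollama pull qwen3:32b
+GOOSE_PROVIDER=ollama GOOSE_MODEL=qwen3:32b goose run -t "summarise what is in this directory"
+```
+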
+My current setup is a 64GB M1 Pro MacBook (circa 2022 hardware), which probably has less than 48GB available for GPU/AI workloads. That puts a limit on what I can run, but Qwen3 with "no think" mode works acceptably for my purposes.
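+
+If you want to try this end to end, the setup is small. A minimal sketch, assuming Ollama is installed and you want the tweaked `michaelneale/qwen3` tag from above (the stock `qwen3` tag works too):
+
+```zsh
+# fetch the model into the local ollama store
+ollama pull michaelneale/qwen3
+
+# one-off goose run pointed at the local model
+GOOSE_PROVIDER=ollama GOOSE_MODEL=michaelneale/qwen3 goose run -t "what is listening on port 8000?"
+```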