docs: local LLMs context size tip (#3454)
Signed-off-by: jjjuk <gmodhl67@gmail.com>
Co-authored-by: angiejones <jones.angie@gmail.com>

@@ -306,7 +306,7 @@ Ollama and Ramalama are both options to provide local LLMs, each of which requires
 2. Run any [model supporting tool-calling](https://ollama.com/search?c=tools):
 
 :::warning Limited Support for models without tool calling
-Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
+Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions).
 :::
 
 Example:
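
The warning above turns on whether the served model actually supports tool calling. As a quick check, a minimal sketch, assuming `qwen2.5` (one tool-calling model from the Ollama library) and a recent Ollama release, which prints a Capabilities section for installed models:

```sh
# Pull and run a tool-calling model (qwen2.5 is just one example).
ollama run qwen2.5

# Inspect the model card; recent Ollama releases list capabilities,
# which should include "tools" for models Goose can fully drive.
ollama show qwen2.5
```

If `tools` is missing from the listing, treat the model as chat-only and disable extensions as the warning describes.
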
@@ -397,20 +397,24 @@ If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST=
 └ Configuration saved successfully
 ```
 
+:::tip Context Length
+If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 4096 tokens is too low. Set the `OLLAMA_CONTEXT_LENGTH` environment variable to a [higher value](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size).
+:::
+
 #### Ramalama
 
 1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install).
 2. Run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF format HuggingFace model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model):
 
 :::warning Limited Support for models without tool calling
-Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
+Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions).
 :::
 
 Example:
 
 ```sh
 # NOTE: the --runtime-args="--jinja" flag is required for Ramalama to work with the Goose Ollama provider.
-ramalama serve --runtime-args="--jinja" ollama://qwen2.5
+ramalama serve --runtime-args="--jinja" --ctx-size=8192 ollama://qwen2.5
 ```
 
 3. In a separate terminal window, configure with Goose:
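
Both context-length tips added in this change come down to a single knob set at server start. A minimal sketch of each, assuming 8192 tokens is enough headroom for your extensions and `.goosehints` (adjust as needed):

```sh
# Ollama: raise the context window with the environment variable from the
# tip above, then start the server.
OLLAMA_CONTEXT_LENGTH=8192 ollama serve

# Ramalama: the equivalent knob is --ctx-size (alias -c), alongside the
# --jinja runtime flag that the Goose Ollama provider requires.
ramalama serve --runtime-args="--jinja" --ctx-size=8192 ollama://qwen2.5
```

If the server runs on a different machine, the hunk above also notes setting `OLLAMA_HOST` (the host and port of that server) when starting Goose.
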
@@ -493,6 +497,11 @@ For the Ollama provider, if you don't provide a host, we set it to `localhost:11434`
 └ Configuration saved successfully
 ```
 
+:::tip Context Length
+If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 2048 tokens is too low. Use `ramalama serve` to set the `--ctx-size, -c` option to a [higher value](https://github.com/containers/ramalama/blob/main/docs/ramalama-serve.1.md#--ctx-size--c).
+:::
+
+
 ### DeepSeek-R1
 
 Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally.
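
The `### DeepSeek-R1` section shown above runs one such model locally. A minimal sketch, assuming the stock `deepseek-r1` tag from the Ollama library; since it lacks tool calling, the warnings above apply and all extensions must be disabled first:

```sh
# DeepSeek-r1 has no tool calling, so Goose can only use it for chat
# completion; disable all Goose extensions before pointing Goose at it.
ollama run deepseek-r1
```
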