docs: model context limit overrides (#3377)

2025-12-17 22:24:21 +01:00 · 2025-07-14 20:47:52 -07:00
parent fc3d59f6f9
commit 7d48d7dd34
2 changed files with 82 additions and 1 deletions
--- a/documentation/docs/guides/environment-variables.md
+++ b/documentation/docs/guides/environment-variables.md
@@ -131,7 +131,7 @@ export GOOSE_MAX_TURNS=100
 export GOOSE_CLI_THEME=ansi
 ```

-### Context Limit Configuration
+### Model Context Limit Overrides

 These variables allow you to override the default context window size (token limit) for your models. This is particularly useful when using [LiteLLM proxies](https://docs.litellm.ai/docs/providers/litellm_proxy) or custom models that don't match Goose's predefined model patterns.

@@ -156,6 +156,8 @@ export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
 export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
 ```

+For more details and examples, see [Model Context Limit Overrides](/docs/guides/smart-context-management#model-context-limit-overrides).
+
 ## Tool Configuration

 These variables control how Goose handles [tool permissions](/docs/guides/managing-tools/tool-permissions) and their execution.
--- a/documentation/docs/guides/smart-context-management.md
+++ b/documentation/docs/guides/smart-context-management.md
@@ -264,6 +264,85 @@ After sending your first message, Goose Desktop and Goose CLI display token usag
    </TabItem>
 </Tabs>

+## Model Context Limit Overrides
+
+Context limits are automatically detected based on your model name, but Goose provides settings to override the default limits:
+
+| Model | Description | Best For | Setting |
+|-------|-------------|----------|---------|
+| **Main** | Set context limit for the main model (also serves as fallback for other models) | LiteLLM proxies, custom models with non-standard names | `GOOSE_CONTEXT_LIMIT` |
+| **Lead** | Set larger context for planning in [lead/worker mode](/docs/tutorials/lead-worker) | Complex planning tasks requiring more context | `GOOSE_LEAD_CONTEXT_LIMIT` |
+| **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` |
+| **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` |
+
+:::info
+These settings only affect the displayed token usage and progress indicators. Actual context management is handled by your LLM, so real usage may be higher or lower than the limit you set, regardless of what the display shows.
+:::
+
+This feature is particularly useful with:
+
+- **LiteLLM Proxy Models**: When using LiteLLM with custom model names that don't match Goose's patterns
+- **Enterprise Deployments**: Custom model deployments with non-standard naming
+- **Fine-tuned Models**: Custom models with different context limits than their base versions
+- **Development/Testing**: Temporarily adjusting context limits for testing purposes
+
+Goose resolves context limits with the following precedence (highest to lowest):
+
+1. Explicit `context_limit` in model configuration (if set programmatically)
+2. Specific environment variable (e.g., `GOOSE_LEAD_CONTEXT_LIMIT`)
+3. Global environment variable (`GOOSE_CONTEXT_LIMIT`)
+4. Model-specific default based on name pattern matching
+5. Global default (128,000 tokens)
+
+**Configuration**
+
+<Tabs groupId="interface">
+  <TabItem value="ui" label="Goose Desktop" default>
+
+     Model context limit overrides are not yet available in the Goose Desktop app.
+
+  </TabItem>
+  <TabItem value="cli" label="Goose CLI">
+
+    Context limit overrides only work as [environment variables](/docs/guides/environment-variables#model-context-limit-overrides), not in the config file.
+
+    ```bash
+    export GOOSE_CONTEXT_LIMIT=1000
+    goose session
+    ```
+
+  </TabItem>
+    
+</Tabs>
+
+**Scenarios**
+
+1. LiteLLM proxy with custom model name
+
+```bash
+# LiteLLM proxy with custom model name
+export GOOSE_PROVIDER="openai"
+export GOOSE_MODEL="my-custom-gpt4-proxy"
+export GOOSE_CONTEXT_LIMIT=200000  # Override the default limit
+```
+
+2. Lead/worker setup with different context limits
+
+```bash
+# Different context limits for planning vs execution
+export GOOSE_LEAD_MODEL="claude-opus-custom"
+export GOOSE_LEAD_CONTEXT_LIMIT=500000    # Large context for planning
+export GOOSE_WORKER_CONTEXT_LIMIT=128000  # Smaller context for execution
+```
+
+3. Planner with large context
+
+```bash
+# Large context for complex planning
+export GOOSE_PLANNER_MODEL="gpt-4-custom"
+export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
+```
+
 ## Cost Tracking
 Display estimated real-time costs of your session at the bottom of the Goose Desktop window.
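
Note on the precedence order documented under "Model Context Limit Overrides" in this change: the five-step resolution can be sketched as a small shell function. This is an illustration only, not Goose's actual implementation. The function name `resolve_context_limit` and its arguments (a role, an optional explicit limit, and an optional pattern-matched default supplied by the caller) are hypothetical. Only the environment variable names and the 128,000-token global default come from the docs above.

```bash
# Illustrative sketch of the documented precedence; not Goose's real code.
# Usage: resolve_context_limit ROLE [EXPLICIT_LIMIT] [PATTERN_DEFAULT]
#   ROLE            main, lead, worker, or planner
#   EXPLICIT_LIMIT  stand-in for a programmatically set context_limit (hypothetical)
#   PATTERN_DEFAULT stand-in for the default found by name pattern matching (hypothetical)
resolve_context_limit() (
  role="$1"
  explicit="${2:-}"
  pattern_default="${3:-}"

  # Pick the role-specific environment variable, if the role has one.
  case "$role" in
    lead)    specific="${GOOSE_LEAD_CONTEXT_LIMIT:-}" ;;
    worker)  specific="${GOOSE_WORKER_CONTEXT_LIMIT:-}" ;;
    planner) specific="${GOOSE_PLANNER_CONTEXT_LIMIT:-}" ;;
    *)       specific="" ;;
  esac

  if [ -n "$explicit" ]; then
    echo "$explicit"                  # 1. explicit context_limit
  elif [ -n "$specific" ]; then
    echo "$specific"                  # 2. specific environment variable
  elif [ -n "${GOOSE_CONTEXT_LIMIT:-}" ]; then
    echo "$GOOSE_CONTEXT_LIMIT"       # 3. global environment variable
  elif [ -n "$pattern_default" ]; then
    echo "$pattern_default"           # 4. model-specific default
  else
    echo 128000                       # 5. global default
  fi
)
```

With `GOOSE_CONTEXT_LIMIT=200000` and `GOOSE_LEAD_CONTEXT_LIMIT=500000` exported and no other overrides set, `resolve_context_limit lead` prints `500000`, while `resolve_context_limit worker` falls back to the global variable and prints `200000`.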