init push

This commit is contained in:
zachary62
2025-04-04 13:01:50 -04:00
parent 97c20e803a
commit e62ee2cb13
162 changed files with 42423 additions and 11 deletions


@@ -0,0 +1,281 @@
# Chapter 1: Agent - The Workers of AutoGen
Welcome to the AutoGen Core tutorial! We're excited to guide you through building powerful applications with autonomous agents.
## Motivation: Why Do We Need Agents?
Imagine you want to build an automated system to write blog posts. You might need one part of the system to research a topic and another part to write the actual post based on the research. How do you represent these different "workers" and make them talk to each other?
This is where the concept of an **Agent** comes in. In AutoGen Core, an `Agent` is the fundamental building block representing an actor or worker in your system. Think of it like an employee in an office.
## Key Concepts: Understanding Agents
Let's break down what makes an Agent:
1. **It's a Worker:** An Agent is designed to *do* things. This could be running calculations, calling a Large Language Model (LLM) like ChatGPT, using a tool (like a search engine), or managing a piece of data.
2. **It Has an Identity (`AgentId`):** Just like every employee has a name and a job title, every Agent needs a unique identity. This identity, called `AgentId`, has two parts:
* `type`: What kind of role does the agent have? (e.g., "researcher", "writer", "coder"). This helps organize agents.
* `key`: A unique name for this specific agent instance (e.g., "researcher-01", "amy-the-writer").
```python
# From: _agent_id.py
class AgentId:
    def __init__(self, type: str, key: str) -> None:
        # ... (validation checks omitted for brevity)
        self._type = type
        self._key = key

    @property
    def type(self) -> str:
        return self._type

    @property
    def key(self) -> str:
        return self._key

    def __str__(self) -> str:
        # Creates an id like "researcher/amy-the-writer"
        return f"{self._type}/{self._key}"
```
This `AgentId` acts like the agent's address, allowing other agents (or the system) to send messages specifically to it.
3. **It Has Metadata (`AgentMetadata`):** Besides its core identity, an agent often has descriptive information.
* `type`: Same as in `AgentId`.
* `key`: Same as in `AgentId`.
* `description`: A human-readable explanation of what the agent does (e.g., "Researches topics using web search").
```python
# From: _agent_metadata.py
from typing import TypedDict
class AgentMetadata(TypedDict):
    type: str
    key: str
    description: str
```
This metadata helps understand the agent's purpose within the system.
4. **It Communicates via Messages:** Agents don't work in isolation. They collaborate by sending and receiving messages. The primary way an agent receives work is through its `on_message` method. Think of this like the agent's inbox.
```python
# From: _agent.py (Simplified Agent Protocol)
from typing import Any, Mapping, Protocol
# ... other imports
class Agent(Protocol):
    @property
    def id(self) -> AgentId: ...  # The agent's unique ID

    async def on_message(self, message: Any, ctx: MessageContext) -> Any:
        """Handles an incoming message."""
        # Agent's logic to process the message goes here
        ...
```
When an agent receives a message, `on_message` is called. The `message` contains the data or task, and `ctx` (MessageContext) provides extra information about the message (like who sent it). We'll cover `MessageContext` more later.
5. **It Can Remember Things (State):** Sometimes, an agent needs to remember information between tasks, like keeping notes on research progress. Agents can optionally implement `save_state` and `load_state` methods to store and retrieve their internal memory.
```python
# From: _agent.py (Simplified Agent Protocol)
class Agent(Protocol):
    # ... other methods

    async def save_state(self) -> Mapping[str, Any]:
        """Save the agent's internal memory."""
        # Return a dictionary representing the state
        ...

    async def load_state(self, state: Mapping[str, Any]) -> None:
        """Load the agent's internal memory."""
        # Restore state from the dictionary
        ...
```
We'll explore state and memory in more detail in [Chapter 7: Memory](07_memory.md).
6. **Different Agent Types:** AutoGen Core provides base classes to make creating agents easier:
* `BaseAgent`: The fundamental class most agents inherit from. It provides common setup.
* `ClosureAgent`: A very quick way to create simple agents using just a function (like hiring a temp worker for a specific task defined on the spot).
* `RoutedAgent`: An agent that can automatically direct different types of messages to different internal handler methods (like a smart receptionist).
## Use Case Example: Researcher and Writer
Let's revisit our blog post example. We want a `Researcher` agent and a `Writer` agent.
**Goal:**
1. Tell the `Researcher` a topic (e.g., "AutoGen Agents").
2. The `Researcher` finds some facts (we'll keep it simple and just make them up for now).
3. The `Researcher` sends these facts to the `Writer`.
4. The `Writer` receives the facts and drafts a short post.
**Simplified Implementation Idea (using `ClosureAgent` for brevity):**
First, let's define the messages they might exchange:
```python
from dataclasses import dataclass
@dataclass
class ResearchTopic:
    topic: str

@dataclass
class ResearchFacts:
    topic: str
    facts: list[str]

@dataclass
class DraftPost:
    topic: str
    draft: str
```
These are simple Python classes to hold the data being passed around.
Now, let's imagine defining the `Researcher` using a `ClosureAgent`. This agent will listen for `ResearchTopic` messages.
```python
# Simplified concept - requires AgentRuntime (Chapter 3) to actually run
async def researcher_logic(agent_context, message: ResearchTopic, msg_context):
    print(f"Researcher received topic: {message.topic}")
    # In a real scenario, this would involve searching, calling an LLM, etc.
    # For now, we just make up facts.
    facts = [f"Fact 1 about {message.topic}", f"Fact 2 about {message.topic}"]
    print(f"Researcher found facts: {facts}")
    # Find the Writer agent's ID (we assume we know it)
    writer_id = AgentId(type="writer", key="blog_writer_1")
    # Send the facts to the Writer
    await agent_context.send_message(
        message=ResearchFacts(topic=message.topic, facts=facts),
        recipient=writer_id,
    )
    print("Researcher sent facts to Writer.")
    # This agent doesn't return a direct reply
    return None
```
This `researcher_logic` function defines *what* the researcher does when it gets a `ResearchTopic` message. It processes the topic, creates `ResearchFacts`, and uses `agent_context.send_message` to send them to the `writer` agent.
Similarly, the `Writer` agent would have its own logic:
```python
# Simplified concept - requires AgentRuntime (Chapter 3) to actually run
async def writer_logic(agent_context, message: ResearchFacts, msg_context):
    print(f"Writer received facts for topic: {message.topic}")
    # In a real scenario, this would involve LLM prompting
    draft = f"Blog Post about {message.topic}:\n"
    for fact in message.facts:
        draft += f"- {fact}\n"
    print(f"Writer drafted post:\n{draft}")
    # Perhaps save the draft or send it somewhere else
    # For now, we just print it. We don't send another message.
    return None  # Or maybe return a confirmation/result
```
This `writer_logic` function defines how the writer reacts to receiving `ResearchFacts`.
**Important:** To actually *run* these agents and make them communicate, we need the `AgentRuntime` (covered in [Chapter 3: AgentRuntime](03_agentruntime.md)) and the `Messaging System` (covered in [Chapter 2: Messaging System](02_messaging_system__topic___subscription_.md)). For now, focus on the *idea* that Agents are distinct workers defined by their logic (`on_message`) and identified by their `AgentId`.
## Under the Hood: How an Agent Gets a Message
While the full message delivery involves the `Messaging System` and `AgentRuntime`, let's look at the agent's role when it receives a message.
**Conceptual Flow:**
```mermaid
sequenceDiagram
participant Sender as Sender Agent
participant Runtime as AgentRuntime
participant Recipient as Recipient Agent
Sender->>+Runtime: send_message(message, recipient_id)
Runtime->>+Recipient: Locate agent by recipient_id
Runtime->>+Recipient: on_message(message, context)
Recipient->>Recipient: Process message using internal logic
alt Response Needed
Recipient->>-Runtime: Return response value
Runtime->>-Sender: Deliver response value
else No Response
Recipient->>-Runtime: Return None (or no return)
end
```
1. Some other agent (Sender) or the system decides to send a message to our agent (Recipient).
2. It tells the `AgentRuntime` (the manager): "Deliver this `message` to the agent with `recipient_id`".
3. The `AgentRuntime` finds the correct `Recipient` agent instance.
4. The `AgentRuntime` calls the `Recipient.on_message(message, context)` method.
5. The agent's internal logic inside `on_message` (or methods called by it, like in `RoutedAgent`) runs to process the message.
6. If the message requires a direct response (like an RPC call), the agent returns a value from `on_message`. If not (like a general notification or event), it might return `None`.
**Code Glimpse:**
The core definition is the `Agent` Protocol (`_agent.py`). It's like an interface or contract: any class that wants to be an Agent *must* provide these methods.
```python
# From: _agent.py - The Agent blueprint (Protocol)
@runtime_checkable
class Agent(Protocol):
    @property
    def metadata(self) -> AgentMetadata: ...

    @property
    def id(self) -> AgentId: ...

    async def on_message(self, message: Any, ctx: MessageContext) -> Any: ...

    async def save_state(self) -> Mapping[str, Any]: ...

    async def load_state(self, state: Mapping[str, Any]) -> None: ...

    async def close(self) -> None: ...
```
Most agents you create will inherit from `BaseAgent` (`_base_agent.py`). It provides some standard setup:
```python
# From: _base_agent.py (Simplified)
class BaseAgent(ABC, Agent):
    def __init__(self, description: str) -> None:
        # Gets runtime & id from a special context when created by the runtime.
        # Raises an error if you try to create it directly!
        self._runtime: AgentRuntime = AgentInstantiationContext.current_runtime()
        self._id: AgentId = AgentInstantiationContext.current_agent_id()
        self._description = description
        # ...

    # This is the final version called by the runtime
    @final
    async def on_message(self, message: Any, ctx: MessageContext) -> Any:
        # It calls the implementation method you need to write
        return await self.on_message_impl(message, ctx)

    # You MUST implement this in your subclass
    @abstractmethod
    async def on_message_impl(self, message: Any, ctx: MessageContext) -> Any: ...

    # Helper to send messages easily
    async def send_message(self, message: Any, recipient: AgentId, ...) -> Any:
        # It just asks the runtime to do the actual sending
        return await self._runtime.send_message(
            message, sender=self.id, recipient=recipient, ...
        )

    # ... other methods like publish_message, save_state, load_state
```
Notice how `BaseAgent` handles getting its `id` and `runtime` during creation and provides a convenient `send_message` method that uses the runtime. When inheriting from `BaseAgent`, you primarily focus on implementing the `on_message_impl` method to define your agent's unique behavior.
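To make this concrete, here is a minimal, hypothetical `BaseAgent` subclass (a sketch only; `EchoAgent` is not part of AutoGen Core, and, as described above, it must be created by the runtime rather than instantiated directly):
```python
# Hypothetical example: a BaseAgent subclass that echoes messages back.
# It assumes it is instantiated by the AgentRuntime, which supplies its id and runtime.
from typing import Any
from autogen_core import BaseAgent, MessageContext

class EchoAgent(BaseAgent):
    def __init__(self) -> None:
        super().__init__(description="Echoes incoming messages back to the sender.")

    async def on_message_impl(self, message: Any, ctx: MessageContext) -> Any:
        # All custom behavior lives here; BaseAgent.on_message delegates to it.
        print(f"{self.id} received: {message}")
        return message  # Becomes the reply for direct send_message calls
```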
## Next Steps
You now understand the core concept of an `Agent` in AutoGen Core! It's the fundamental worker unit with an identity, the ability to process messages, and optionally maintain state.
In the next chapters, we'll explore:
* [Chapter 2: Messaging System](02_messaging_system__topic___subscription_.md): How messages actually travel between agents.
* [Chapter 3: AgentRuntime](03_agentruntime.md): The manager responsible for creating, running, and connecting agents.
Let's continue building your understanding!
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)


@@ -0,0 +1,267 @@
# Chapter 2: Messaging System (Topic & Subscription)
In [Chapter 1: Agent](01_agent.md), we learned about Agents as individual workers. But how do they coordinate when one agent doesn't know exactly *who* needs the information it produces? Imagine our Researcher finds some facts. Maybe the Writer needs them, but maybe a Fact-Checker agent or a Summary agent also needs them later. How can the Researcher just announce "Here are the facts!" without needing a specific mailing list?
This is where the **Messaging System**, specifically **Topics** and **Subscriptions**, comes in. It allows agents to broadcast messages to anyone interested, like posting on a company announcement board.
## Motivation: Broadcasting Information
Let's refine our blog post example:
1. The `Researcher` agent finds facts about "AutoGen Agents".
2. Instead of sending *directly* to the `Writer`, the `Researcher` **publishes** these facts to a general "research-results" **Topic**.
3. The `Writer` agent has previously told the system it's **subscribed** to the "research-results" Topic.
4. The system sees the new message on the Topic and delivers it to the `Writer` (and any other subscribers).
This way, the `Researcher` doesn't need to know who the `Writer` is, or even if a `Writer` exists! It just broadcasts the results. If we later add a `FactChecker` agent that also needs the results, it simply subscribes to the same Topic.
## Key Concepts: Topics and Subscriptions
Let's break down the components of this broadcasting system:
1. **Topic (`TopicId`): The Announcement Board**
* A `TopicId` represents a specific channel or category for messages. Think of it like the name of an announcement board (e.g., "Project Updates", "General Announcements").
* It has two main parts:
* `type`: What *kind* of event or information is this? (e.g., "research.completed", "user.request"). This helps categorize messages.
* `source`: *Where* or *why* did this event originate? Often, this relates to the specific task or context (e.g., the specific blog post being researched like "autogen-agents-blog-post", or the team generating the event like "research-team").
```python
# From: _topic.py (Simplified)
from dataclasses import dataclass
@dataclass(frozen=True) # Immutable: can't change after creation
class TopicId:
    type: str
    source: str

    def __str__(self) -> str:
        # Creates an id like "research.completed/autogen-agents-blog-post"
        return f"{self.type}/{self.source}"
```
This structure allows for flexible filtering. Agents might subscribe to all topics of a certain `type`, regardless of the `source`, or only to topics with a specific `source`.
2. **Publishing: Posting the Announcement**
* When an agent has information to share broadly, it *publishes* a message to a specific `TopicId`.
* This is like pinning a note to the designated announcement board. The agent doesn't need to know who will read it.
3. **Subscription (`Subscription`): Signing Up for Updates**
* A `Subscription` is how an agent declares its interest in certain `TopicId`s.
* It acts like a rule: "If a message is published to a Topic that matches *this pattern*, please deliver it to *this kind of agent*".
* The `Subscription` links a `TopicId` pattern (e.g., "all topics with type `research.completed`") to an `AgentId` (or a way to determine the `AgentId`).
4. **Routing: Delivering the Mail**
* The `AgentRuntime` (the system manager we'll meet in [Chapter 3: AgentRuntime](03_agentruntime.md)) keeps track of all active `Subscription`s.
* When a message is published to a `TopicId`, the `AgentRuntime` checks which `Subscription`s match that `TopicId`.
* For each match, it uses the `Subscription`'s rule to figure out which specific `AgentId` should receive the message and delivers it.
## Use Case Example: Researcher Publishes, Writer Subscribes
Let's see how our Researcher and Writer can use this system.
**Goal:** Researcher publishes facts to a topic, Writer receives them via subscription.
**1. Define the Topic:**
We need a `TopicId` for research results. Let's say the `type` is "research.facts.available" and the `source` identifies the specific research task (e.g., "blog-post-autogen").
```python
# From: _topic.py
from autogen_core import TopicId
# Define the topic for this specific research task
research_topic_id = TopicId(type="research.facts.available", source="blog-post-autogen")
print(f"Topic ID: {research_topic_id}")
# Output: Topic ID: research.facts.available/blog-post-autogen
```
This defines the "announcement board" we'll use.
**2. Researcher Publishes:**
The `Researcher` agent, after finding facts, will use its `agent_context` (provided by the runtime) to publish the `ResearchFacts` message to this topic.
```python
# Simplified concept - Researcher agent logic
# Assume 'agent_context' and 'message' (ResearchTopic) are provided
# Define the facts message (from Chapter 1)
from dataclasses import dataclass

from autogen_core import TopicId

@dataclass
class ResearchFacts:
    topic: str
    facts: list[str]

async def researcher_publish_logic(agent_context, message: ResearchTopic, msg_context):
    print(f"Researcher working on: {message.topic}")
    facts_data = ResearchFacts(
        topic=message.topic,
        facts=[f"Fact A about {message.topic}", f"Fact B about {message.topic}"],
    )
    # Define the specific topic for this task's results
    results_topic = TopicId(type="research.facts.available", source=message.topic)  # Use message topic as source
    # Publish the facts to the topic
    await agent_context.publish_message(message=facts_data, topic_id=results_topic)
    print(f"Researcher published facts to topic: {results_topic}")
    # No direct reply needed
    return None
```
Notice the `agent_context.publish_message` call. The Researcher doesn't specify a recipient, only the topic.
**3. Writer Subscribes:**
The `Writer` agent needs to tell the system it's interested in messages on topics like "research.facts.available". We can use a predefined `Subscription` type called `TypeSubscription`. This subscription typically means: "I am interested in all topics with this *exact type*. When a message arrives, create/use an agent of *my type* whose `key` matches the topic's `source`."
```python
# From: _type_subscription.py (Simplified Concept)
from autogen_core import TypeSubscription, BaseAgent
class WriterAgent(BaseAgent):
    # ... agent implementation ...
    async def on_message_impl(self, message: ResearchFacts, ctx):
        # This method gets called when a subscribed message arrives
        print(f"Writer ({self.id}) received facts via subscription: {message.facts}")
        # ... process facts and write draft ...

# How the Writer subscribes (usually done during runtime setup - Chapter 3)
# This tells the runtime: "Messages on topics with type 'research.facts.available'
# should go to a 'writer' agent whose key matches the topic source."
writer_subscription = TypeSubscription(
    topic_type="research.facts.available",
    agent_type="writer"  # The type of agent that should handle this
)
print(f"Writer subscription created for topic type: {writer_subscription.topic_type}")
# Output: Writer subscription created for topic type: research.facts.available
```
When the `Researcher` publishes to `TopicId(type="research.facts.available", source="blog-post-autogen")`, the `AgentRuntime` will see that `writer_subscription` matches the `topic_type`. It will then use the rule: "Find (or create) an agent with `AgentId(type='writer', key='blog-post-autogen')` and deliver the message."
**Benefit:** Decoupling! The Researcher just broadcasts. The Writer just listens for relevant broadcasts. We can add more listeners (like a `FactChecker` subscribing to the same `topic_type`) without changing the `Researcher` at all.
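To illustrate, here is a hypothetical sketch of that `FactChecker`. It reuses the `ResearchFacts` dataclass from earlier in this chapter and only needs its own subscription on the same topic type; the `Researcher` code stays untouched.
```python
# Hypothetical sketch: a second listener on the same topic type.
from autogen_core import BaseAgent, MessageContext, TypeSubscription

class FactCheckerAgent(BaseAgent):
    async def on_message_impl(self, message: ResearchFacts, ctx: MessageContext):
        print(f"FactChecker ({self.id}) verifying: {message.facts}")

# Registered alongside writer_subscription; the runtime (Chapter 3) will now route
# each published ResearchFacts message to both a 'writer' and a 'fact_checker' agent.
fact_checker_subscription = TypeSubscription(
    topic_type="research.facts.available",
    agent_type="fact_checker",
)
```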
## Under the Hood: How Publishing Works
Let's trace the journey of a published message.
**Conceptual Flow:**
```mermaid
sequenceDiagram
participant Publisher as Publisher Agent
participant Runtime as AgentRuntime
participant SubRegistry as Subscription Registry
participant Subscriber as Subscriber Agent
Publisher->>+Runtime: publish_message(message, topic_id)
Runtime->>+SubRegistry: Find subscriptions matching topic_id
SubRegistry-->>-Runtime: Return list of matching Subscriptions
loop For each matching Subscription
Runtime->>Subscription: map_to_agent(topic_id)
Subscription-->>Runtime: Return target AgentId
Runtime->>+Subscriber: Locate/Create Agent instance by AgentId
Runtime->>Subscriber: on_message(message, context)
Subscriber-->>-Runtime: Process message (optional return)
end
Runtime-->>-Publisher: Return (usually None for publish)
```
1. **Publish:** An agent calls `agent_context.publish_message(message, topic_id)`. This internally calls the `AgentRuntime`'s publish method.
2. **Lookup:** The `AgentRuntime` takes the `topic_id` and consults its internal `Subscription Registry`.
3. **Match:** The Registry checks all registered `Subscription` objects. Each `Subscription` has an `is_match(topic_id)` method. The registry finds all subscriptions where `is_match` returns `True`.
4. **Map:** For each matching `Subscription`, the Runtime calls its `map_to_agent(topic_id)` method. This method returns the specific `AgentId` that should handle this message based on the subscription rule and the topic details.
5. **Deliver:** The `AgentRuntime` finds the agent instance corresponding to the returned `AgentId` (potentially creating it if it doesn't exist yet, especially with `TypeSubscription`). It then calls that agent's `on_message` method, delivering the original published `message`.
**Code Glimpse:**
* **`TopicId` (`_topic.py`):** As shown before, a simple dataclass holding `type` and `source`. It includes validation to ensure the `type` follows certain naming conventions.
```python
# From: _topic.py
@dataclass(eq=True, frozen=True)
class TopicId:
    type: str
    source: str
    # ... validation and __str__ ...

    @classmethod
    def from_str(cls, topic_id: str) -> Self:
        # Helper to parse "type/source" string
        # ... implementation ...
        ...
```
* **`Subscription` Protocol (`_subscription.py`):** This defines the *contract* for any subscription rule.
```python
# From: _subscription.py (Simplified Protocol)
from typing import Protocol
# ... other imports
class Subscription(Protocol):
    @property
    def id(self) -> str: ...  # Unique ID for this subscription instance

    def is_match(self, topic_id: TopicId) -> bool:
        """Check if a topic matches this subscription's rule."""
        ...

    def map_to_agent(self, topic_id: TopicId) -> AgentId:
        """Determine the target AgentId if is_match was True."""
        ...
```
Any class implementing these methods can act as a subscription rule.
* **`TypeSubscription` (`_type_subscription.py`):** A common implementation of the `Subscription` protocol.
```python
# From: _type_subscription.py (Simplified)
class TypeSubscription(Subscription):
    def __init__(self, topic_type: str, agent_type: str, ...):
        self._topic_type = topic_type
        self._agent_type = agent_type
        # ... generates a unique self._id ...

    def is_match(self, topic_id: TopicId) -> bool:
        # Matches if the topic's type is exactly the one we want
        return topic_id.type == self._topic_type

    def map_to_agent(self, topic_id: TopicId) -> AgentId:
        # Maps to an agent of the specified type, using the
        # topic's source as the agent's unique key.
        if not self.is_match(topic_id):
            raise CantHandleException(...)  # Should not happen if used correctly
        return AgentId(type=self._agent_type, key=topic_id.source)

    # ... id property ...
```
This implementation provides the "one agent instance per source" behavior for a specific topic type.
* **`DefaultSubscription` (`_default_subscription.py`):** This is often used via a decorator (`@default_subscription`) and provides a convenient way to create a `TypeSubscription` where the `agent_type` is automatically inferred from the agent class being defined, and the `topic_type` defaults to "default" (but can be overridden). It simplifies common use cases.
```python
# From: _default_subscription.py (Conceptual Usage)
from autogen_core import BaseAgent, default_subscription

# ResearchFacts is the dataclass defined earlier in this chapter

@default_subscription  # Uses the 'default' topic type; the agent type is inferred
class WriterAgent(BaseAgent):
    # Agent logic here...
    async def on_message_impl(self, message: ResearchFacts, ctx): ...

# Or specify the topic type
@default_subscription(topic_type="research.facts.available")
class SpecificWriterAgent(BaseAgent):
    # Agent logic here...
    async def on_message_impl(self, message: ResearchFacts, ctx): ...
```
The actual sending (`publish_message`) and routing logic reside within the `AgentRuntime`, which we'll explore next.
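As a quick standalone check (no runtime involved), here is a small sketch showing how a `TypeSubscription` resolves a published topic to a concrete `AgentId`, giving one agent instance per topic `source`:
```python
from autogen_core import AgentId, TopicId, TypeSubscription

sub = TypeSubscription(topic_type="research.facts.available", agent_type="writer")
topic = TopicId(type="research.facts.available", source="blog-post-autogen")

assert sub.is_match(topic)                 # The topic type matches the subscription
target: AgentId = sub.map_to_agent(topic)  # AgentId(type="writer", key="blog-post-autogen")
print(target)                              # writer/blog-post-autogen
```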
## Next Steps
You've learned how AutoGen Core uses a publish/subscribe system (`TopicId`, `Subscription`) to allow agents to communicate without direct coupling. This is crucial for building flexible and scalable multi-agent applications.
* **Topic (`TopicId`):** Named channels (`type`/`source`) for broadcasting messages.
* **Publish:** Sending a message to a Topic.
* **Subscription:** An agent's declared interest in messages on certain Topics, defining a routing rule.
Now, let's dive into the orchestrator that manages agents and makes this messaging system work:
* [Chapter 3: AgentRuntime](03_agentruntime.md): The manager responsible for creating, running, and connecting agents, including handling message publishing and subscription routing.
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)


@@ -0,0 +1,349 @@
# Chapter 3: AgentRuntime - The Office Manager
In [Chapter 1: Agent](01_agent.md), we met the workers (`Agent`) of our system. In [Chapter 2: Messaging System](02_messaging_system__topic___subscription_.md), we saw how they can communicate broadly using topics and subscriptions. But who hires these agents? Who actually delivers the messages, whether direct or published? And who keeps the whole system running smoothly?
This is where the **`AgentRuntime`** comes in. It's the central nervous system, the operating system, or perhaps the most fitting analogy: **the office manager** for all your agents.
## Motivation: Why Do We Need an Office Manager?
Imagine an office full of employees (Agents). You have researchers, writers, maybe coders.
* How does a new employee get hired and set up?
* When one employee wants to send a memo directly to another, who makes sure it gets to the right desk?
* When someone posts an announcement on the company bulletin board (publishes to a topic), who ensures everyone who signed up for that type of announcement sees it?
* Who starts the workday and ensures everything keeps running?
Without an office manager, it would be chaos! The `AgentRuntime` serves this crucial role in AutoGen Core. It handles:
1. **Agent Creation:** "Onboarding" new agents when they are needed.
2. **Message Routing:** Delivering direct messages (`send_message`) and published messages (`publish_message`).
3. **Lifecycle Management:** Starting, running, and stopping the whole system.
4. **State Management:** Keeping track of the overall system state (optional).
## Key Concepts: Understanding the Manager's Job
Let's break down the main responsibilities of the `AgentRuntime`:
1. **Agent Instantiation (Hiring):**
* You don't usually create agent objects directly (like `my_agent = ResearcherAgent()`). Why? Because the agent needs to know *about* the runtime (the office it works in) to send messages, publish announcements, etc.
* Instead, you tell the `AgentRuntime`: "I need an agent of type 'researcher'. Here's a recipe (a **factory function**) for how to create one." This is done using `runtime.register_factory(...)`.
* When a message needs to go to a 'researcher' agent with a specific key (e.g., 'researcher-01'), the runtime checks if it already exists. If not, it uses the registered factory function to create (instantiate) the agent.
* **Crucially**, while creating the agent, the runtime provides special context (`AgentInstantiationContext`) so the new agent automatically gets its unique `AgentId` and a reference to the `AgentRuntime` itself. This is like giving a new employee their ID badge and telling them who the office manager is.
```python
# Simplified Concept - How a BaseAgent gets its ID and runtime access
# From: _agent_instantiation.py and _base_agent.py
# Inside the agent's __init__ method (when inheriting from BaseAgent):
class MyAgent(BaseAgent):
    def __init__(self, description: str):
        # This magic happens *because* the AgentRuntime is creating the agent
        # inside a special context.
        self._runtime = AgentInstantiationContext.current_runtime()  # Gets the manager
        self._id = AgentInstantiationContext.current_agent_id()      # Gets its own ID
        self._description = description
        # ... rest of initialization ...
```
This ensures agents are properly integrated into the system from the moment they are created.
2. **Message Delivery (Mail Room):**
* **Direct Send (`send_message`):** When an agent calls `await agent_context.send_message(message, recipient_id)`, it's actually telling the `AgentRuntime`, "Please deliver this `message` directly to the agent identified by `recipient_id`." The runtime finds the recipient agent (creating it if necessary) and calls its `on_message` method. It's like putting a specific name on an envelope and handing it to the mail room.
* **Publish (`publish_message`):** When an agent calls `await agent_context.publish_message(message, topic_id)`, it tells the runtime, "Post this `message` to the announcement board named `topic_id`." The runtime then checks its list of **subscriptions** (who signed up for which boards). For every matching subscription, it figures out the correct recipient agent(s) (based on the subscription rule) and delivers the message to their `on_message` method.
3. **Lifecycle Management (Opening/Closing the Office):**
* The runtime needs to be started to begin processing messages. Typically, you call `runtime.start()`. This usually kicks off a background process or loop that watches for incoming messages.
* When work is done, you need to stop the runtime gracefully. `runtime.stop_when_idle()` is common: it waits until all messages currently in the queue have been processed, then stops. `runtime.stop()` stops more abruptly.
4. **State Management (Office Records):**
* The runtime can save the state of *all* the agents it manages (`runtime.save_state()`) and load it back later (`runtime.load_state()`). This is useful for pausing and resuming complex multi-agent interactions. It can also save/load state for individual agents (`runtime.agent_save_state()` / `runtime.agent_load_state()`). We'll touch more on state in [Chapter 7: Memory](07_memory.md).
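As a rough sketch (assuming the saved state is JSON-serializable; the exact format is defined by the runtime), pausing and resuming a run might look like this:
```python
import json

async def pause_and_resume(runtime):
    # Snapshot the state of all agents managed by this runtime
    state = await runtime.save_state()
    with open("office_state.json", "w") as f:
        json.dump(state, f)

    # ... later, with the same agent factories registered again ...
    with open("office_state.json") as f:
        saved = json.load(f)
    await runtime.load_state(saved)  # Restore every agent's saved memory
```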
## Use Case Example: Running Our Researcher and Writer
Let's finally run the Researcher/Writer scenario from Chapters 1 and 2. We need the `AgentRuntime` to make it happen.
**Goal:**
1. Create a runtime.
2. Register factories for a 'researcher' and a 'writer' agent.
3. Tell the runtime that 'writer' agents are interested in "research.facts.available" topics (add subscription).
4. Start the runtime.
5. Send an initial `ResearchTopic` message to a 'researcher' agent.
6. Let the system run (Researcher publishes facts, Runtime delivers to Writer via subscription, Writer processes).
7. Stop the runtime when idle.
**Code Snippets (Simplified):**
```python
# 0. Imports and Message Definitions (from previous chapters)
import asyncio
from dataclasses import dataclass
from autogen_core import (
    AgentId, BaseAgent, SingleThreadedAgentRuntime, TopicId,
    MessageContext, TypeSubscription, AgentInstantiationContext
)
@dataclass
class ResearchTopic: topic: str
@dataclass
class ResearchFacts: topic: str; facts: list[str]
```
These are the messages our agents will exchange.
```python
# 1. Define Agent Logic (using BaseAgent)
class ResearcherAgent(BaseAgent):
    async def on_message_impl(self, message: ResearchTopic, ctx: MessageContext):
        print(f"Researcher ({self.id}) got topic: {message.topic}")
        facts = [f"Fact 1 about {message.topic}", f"Fact 2"]
        results_topic = TopicId("research.facts.available", message.topic)
        # Use the runtime (via self.publish_message helper) to publish
        await self.publish_message(
            ResearchFacts(topic=message.topic, facts=facts), results_topic
        )
        print(f"Researcher ({self.id}) published facts to {results_topic}")

class WriterAgent(BaseAgent):
    async def on_message_impl(self, message: ResearchFacts, ctx: MessageContext):
        print(f"Writer ({self.id}) received facts via topic '{ctx.topic_id}': {message.facts}")
        draft = f"Draft for {message.topic}: {'; '.join(message.facts)}"
        print(f"Writer ({self.id}) created draft: '{draft}'")
        # This agent doesn't send further messages in this example
```
Here we define the behavior of our two agent types, inheriting from `BaseAgent` which gives us `self.id`, `self.publish_message`, etc.
```python
# 2. Define Agent Factories
def researcher_factory():
    # Gets runtime/id via AgentInstantiationContext inside BaseAgent.__init__
    print("Runtime is creating a ResearcherAgent...")
    return ResearcherAgent(description="I research topics.")

def writer_factory():
    print("Runtime is creating a WriterAgent...")
    return WriterAgent(description="I write drafts from facts.")
```
These simple functions tell the runtime *how* to create instances of our agents when needed.
```python
# 3. Setup and Run the Runtime
async def main():
    # Create the runtime (the office manager)
    runtime = SingleThreadedAgentRuntime()

    # Register the factories (tell the manager how to hire)
    await runtime.register_factory("researcher", researcher_factory)
    await runtime.register_factory("writer", writer_factory)
    print("Registered agent factories.")

    # Add the subscription (tell manager who listens to which announcements)
    # Rule: Messages to topics of type "research.facts.available"
    # should go to a "writer" agent whose key matches the topic source.
    writer_sub = TypeSubscription(topic_type="research.facts.available", agent_type="writer")
    await runtime.add_subscription(writer_sub)
    print(f"Added subscription: {writer_sub.id}")

    # Start the runtime (open the office)
    runtime.start()
    print("Runtime started.")

    # Send the initial message to kick things off
    research_task_topic = "AutoGen Agents"
    researcher_instance_id = AgentId(type="researcher", key=research_task_topic)
    print(f"Sending initial topic '{research_task_topic}' to {researcher_instance_id}")
    await runtime.send_message(
        message=ResearchTopic(topic=research_task_topic),
        recipient=researcher_instance_id,
    )

    # Wait until all messages are processed (wait for work day to end)
    print("Waiting for runtime to become idle...")
    await runtime.stop_when_idle()
    print("Runtime stopped.")

# Run the main function
asyncio.run(main())
```
This script sets up the `SingleThreadedAgentRuntime`, registers the blueprints (factories) and communication rules (subscription), starts the process, and then shuts down cleanly.
**Expected Output (Conceptual Order):**
```
Registered agent factories.
Added subscription: type=research.facts.available=>agent=writer
Runtime started.
Sending initial topic 'AutoGen Agents' to researcher/AutoGen Agents
Waiting for runtime to become idle...
Runtime is creating a ResearcherAgent... # First time researcher/AutoGen Agents is needed
Researcher (researcher/AutoGen Agents) got topic: AutoGen Agents
Researcher (researcher/AutoGen Agents) published facts to research.facts.available/AutoGen Agents
Runtime is creating a WriterAgent... # First time writer/AutoGen Agents is needed (due to subscription)
Writer (writer/AutoGen Agents) received facts via topic 'research.facts.available/AutoGen Agents': ['Fact 1 about AutoGen Agents', 'Fact 2']
Writer (writer/AutoGen Agents) created draft: 'Draft for AutoGen Agents: Fact 1 about AutoGen Agents; Fact 2'
Runtime stopped.
```
You can see the runtime orchestrating the creation of agents and the flow of messages based on the initial request and the subscription rule.
## Under the Hood: How the Manager Works
Let's peek inside the `SingleThreadedAgentRuntime` (a common implementation provided by AutoGen Core) to understand the flow.
**Core Idea:** It uses an internal queue (`_message_queue`) to hold incoming requests (`send_message`, `publish_message`). A background task continuously takes items from the queue and processes them one by one (though the *handling* of a message might involve `await` and allow other tasks to run).
**1. Agent Creation (`_get_agent`, `_invoke_agent_factory`)**
When the runtime needs an agent instance (e.g., to deliver a message) that hasn't been created yet:
```mermaid
sequenceDiagram
participant Runtime as AgentRuntime
participant Factory as Agent Factory Func
participant AgentCtx as AgentInstantiationContext
participant Agent as New Agent Instance
Runtime->>Runtime: Check if agent instance exists (e.g., in `_instantiated_agents` dict)
alt Agent Not Found
Runtime->>Runtime: Find registered factory for agent type
Runtime->>AgentCtx: Set current runtime & agent_id
activate AgentCtx
Runtime->>Factory: Call factory function()
activate Factory
Factory->>AgentCtx: (Inside Agent.__init__) Get current runtime
AgentCtx-->>Factory: Return runtime
Factory->>AgentCtx: (Inside Agent.__init__) Get current agent_id
AgentCtx-->>Factory: Return agent_id
Factory-->>Runtime: Return new Agent instance
deactivate Factory
Runtime->>AgentCtx: Clear context
deactivate AgentCtx
Runtime->>Runtime: Store new agent instance
end
Runtime->>Runtime: Return agent instance
```
* The runtime looks up the factory function registered for the required `AgentId.type`.
* It uses `AgentInstantiationContext.populate_context` to temporarily store its own reference and the target `AgentId`.
* It calls the factory function.
* Inside the agent's `__init__` (usually via `BaseAgent`), `AgentInstantiationContext.current_runtime()` and `AgentInstantiationContext.current_agent_id()` are called to retrieve the context set by the runtime.
* The factory returns the fully initialized agent instance.
* The runtime stores this instance for future use.
```python
# From: _agent_instantiation.py (Simplified)
class AgentInstantiationContext:
    _CONTEXT_VAR = ContextVar("agent_context")  # Stores (runtime, agent_id)

    @classmethod
    @contextmanager
    def populate_context(cls, ctx: tuple[AgentRuntime, AgentId]):
        token = cls._CONTEXT_VAR.set(ctx)  # Store context for this block
        try:
            yield  # Code inside the 'with' block runs here
        finally:
            cls._CONTEXT_VAR.reset(token)  # Clean up context

    @classmethod
    def current_runtime(cls) -> AgentRuntime:
        return cls._CONTEXT_VAR.get()[0]  # Retrieve runtime from context

    @classmethod
    def current_agent_id(cls) -> AgentId:
        return cls._CONTEXT_VAR.get()[1]  # Retrieve agent_id from context
```
This context manager pattern ensures the correct runtime and ID are available *only* during the agent's creation by the runtime.
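Putting it together, the runtime's factory invocation might conceptually look like the sketch below (simplified; the real implementation does more bookkeeping):
```python
def instantiate_agent(runtime, agent_id, factory):
    # Make the runtime and target AgentId visible to the agent's __init__ ...
    with AgentInstantiationContext.populate_context((runtime, agent_id)):
        agent = factory()  # BaseAgent.__init__ reads current_runtime()/current_agent_id()
    # ... and hide them again once construction is finished.
    return agent
```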
**2. Direct Messaging (`send_message` -> `_process_send`)**
```mermaid
sequenceDiagram
participant Sender as Sending Agent/Code
participant Runtime as AgentRuntime
participant Queue as Internal Queue
participant Recipient as Recipient Agent
Sender->>+Runtime: send_message(msg, recipient_id, ...)
Runtime->>Runtime: Create Future (for response)
Runtime->>+Queue: Put SendMessageEnvelope(msg, recipient_id, future)
Runtime-->>-Sender: Return awaitable Future
Note over Queue, Runtime: Background task picks up envelope
Runtime->>Runtime: _process_send(envelope)
Runtime->>+Recipient: _get_agent(recipient_id) (creates if needed)
Recipient-->>-Runtime: Return Agent instance
Runtime->>+Recipient: on_message(msg, context)
Recipient->>Recipient: Process message...
Recipient-->>-Runtime: Return response value
Runtime->>Runtime: Set Future result with response value
```
* `send_message` creates a `Future` object (a placeholder for the eventual result) and wraps the message details in a `SendMessageEnvelope`.
* This envelope is put onto the internal `_message_queue`.
* The background task picks up the envelope.
* `_process_send` gets the recipient agent instance (using `_get_agent`).
* It calls the recipient's `on_message` method.
* When `on_message` returns a result, `_process_send` sets the result on the `Future` object, which makes the original `await runtime.send_message(...)` call return the value.
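A conceptual sketch of that send path is shown below (the attribute names on the envelope are illustrative, not the actual internals):
```python
async def process_send(runtime, envelope):
    # Get (or lazily create) the recipient agent instance
    recipient = await runtime._get_agent(envelope.recipient)
    try:
        result = await recipient.on_message(envelope.message, envelope.context)
        envelope.future.set_result(result)   # Wakes up the awaiting send_message call
    except Exception as e:
        envelope.future.set_exception(e)     # Propagates the error to the sender
```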
**3. Publish/Subscribe (`publish_message` -> `_process_publish`)**
```mermaid
sequenceDiagram
participant Publisher as Publishing Agent/Code
participant Runtime as AgentRuntime
participant Queue as Internal Queue
participant SubManager as SubscriptionManager
participant Subscriber as Subscribed Agent
Publisher->>+Runtime: publish_message(msg, topic_id, ...)
Runtime->>+Queue: Put PublishMessageEnvelope(msg, topic_id)
Runtime-->>-Publisher: Return (None for publish)
Note over Queue, Runtime: Background task picks up envelope
Runtime->>Runtime: _process_publish(envelope)
Runtime->>+SubManager: get_subscribed_recipients(topic_id)
SubManager->>SubManager: Find matching subscriptions
SubManager->>SubManager: Map subscriptions to AgentIds
SubManager-->>-Runtime: Return list of recipient AgentIds
loop For each recipient AgentId
Runtime->>+Subscriber: _get_agent(recipient_id) (creates if needed)
Subscriber-->>-Runtime: Return Agent instance
Runtime->>+Subscriber: on_message(msg, context with topic_id)
Subscriber->>Subscriber: Process message...
Subscriber-->>-Runtime: Return (usually None for publish)
end
```
* `publish_message` wraps the message in a `PublishMessageEnvelope` and puts it on the queue.
* The background task picks it up.
* `_process_publish` asks the `SubscriptionManager` (`_subscription_manager`) for all `AgentId`s that are subscribed to the given `topic_id`.
* The `SubscriptionManager` checks its registered `Subscription` objects (`_subscriptions` list, added via `add_subscription`). For each `Subscription` where `is_match(topic_id)` is true, it calls `map_to_agent(topic_id)` to get the target `AgentId`.
* For each resulting `AgentId`, the runtime gets the agent instance and calls its `on_message` method, providing the `topic_id` in the `MessageContext`.
```python
# From: _runtime_impl_helpers.py (SubscriptionManager simplified)
class SubscriptionManager:
    def __init__(self):
        self._subscriptions: List[Subscription] = []
        # Optimization cache can be added here

    async def add_subscription(self, subscription: Subscription):
        self._subscriptions.append(subscription)
        # Clear cache if any

    async def get_subscribed_recipients(self, topic: TopicId) -> List[AgentId]:
        recipients = []
        for sub in self._subscriptions:
            if sub.is_match(topic):
                recipients.append(sub.map_to_agent(topic))
        return recipients
```
The `SubscriptionManager` simply iterates through registered subscriptions to find matches when a message is published.
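A tiny demo (using the simplified `SubscriptionManager` sketched above together with `TypeSubscription` from Chapter 2) shows the lookup in isolation:
```python
import asyncio
from autogen_core import TopicId, TypeSubscription

async def demo():
    manager = SubscriptionManager()
    await manager.add_subscription(
        TypeSubscription(topic_type="research.facts.available", agent_type="writer")
    )
    topic = TopicId(type="research.facts.available", source="blog-post-autogen")
    recipients = await manager.get_subscribed_recipients(topic)
    print(recipients)  # One AgentId: writer/blog-post-autogen

asyncio.run(demo())
```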
## Next Steps
You now understand the `AgentRuntime` - the essential coordinator that brings Agents to life, manages their communication, and runs the entire show. It handles agent creation via factories, routes direct and published messages, and manages the system's lifecycle.
With the core concepts of `Agent`, `Messaging`, and `AgentRuntime` covered, we can start looking at more specialized building blocks. Next, we'll explore how agents can use external capabilities:
* [Chapter 4: Tool](04_tool.md): How to give agents tools (like functions or APIs) to perform specific actions beyond just processing messages.
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)


@@ -0,0 +1,272 @@
# Chapter 4: Tool - Giving Agents Specific Capabilities
In the previous chapters, we learned about Agents as workers ([Chapter 1](01_agent.md)), how they can communicate directly or using announcements ([Chapter 2](02_messaging_system__topic___subscription_.md)), and the `AgentRuntime` that manages them ([Chapter 3](03_agentruntime.md)).
Agents can process messages and coordinate, but what if an agent needs to perform a very specific action, like looking up information online, running a piece of code, accessing a database, or even just finding out the current date? They need specialized *capabilities*.
This is where the concept of a **Tool** comes in.
## Motivation: Agents Need Skills!
Imagine our `Writer` agent from before. It receives facts and writes a draft. Now, let's say we want the `Writer` (or perhaps a smarter `Assistant` agent helping it) to always include the current date in the blog post title.
How does the agent get the current date? It doesn't inherently know it. It needs a specific *skill* or *tool* for that.
A `Tool` in AutoGen Core represents exactly this: a specific, well-defined capability that an Agent can use. Think of it like giving an employee (Agent) a specialized piece of equipment (Tool), like a calculator, a web browser, or a calendar lookup program.
## Key Concepts: Understanding Tools
Let's break down what defines a Tool:
1. **It's a Specific Capability:** A Tool performs one well-defined task. Examples:
* `search_web(query: str)`
* `run_python_code(code: str)`
* `get_stock_price(ticker: str)`
* `get_current_date()`
2. **It Has a Schema (The Manual):** This is crucial! For an Agent (especially one powered by a Large Language Model - LLM) to know *when* and *how* to use a tool, the tool needs a clear description or "manual". This is called the `ToolSchema`. It typically includes:
* **`name`**: A unique identifier for the tool (e.g., `get_current_date`).
* **`description`**: A clear explanation of what the tool does, which helps the LLM decide if this tool is appropriate for the current task (e.g., "Fetches the current date in YYYY-MM-DD format").
* **`parameters`**: Defines what inputs the tool needs. This is itself a schema (`ParametersSchema`) describing the input fields, their types, and which ones are required. For our `get_current_date` example, it might need no parameters. For `get_stock_price`, it would need a `ticker` parameter of type string.
```python
# From: tools/_base.py (Simplified Concept)
from typing import TypedDict, Dict, Any, Sequence, NotRequired
class ParametersSchema(TypedDict):
    type: str  # Usually "object"
    properties: Dict[str, Any]  # Defines input fields and their types
    required: NotRequired[Sequence[str]]  # List of required field names

class ToolSchema(TypedDict):
    name: str
    description: NotRequired[str]
    parameters: NotRequired[ParametersSchema]
# 'strict' flag also possible (Chapter 5 related)
```
This schema allows an LLM to understand: "Ah, there's a tool called `get_current_date` that takes no inputs and gives me the current date. I should use that now!"
3. **It Can Be Executed:** Once an agent decides to use a tool (often based on the schema), there needs to be a mechanism to actually *run* the tool's underlying function and get the result.
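To make the schema concrete, here is what a hand-written `ToolSchema` for the `get_stock_price` example above might look like (illustrative only; `FunctionTool`, introduced below, generates such schemas automatically):
```python
stock_price_schema = {
    "name": "get_stock_price",
    "description": "Look up the latest stock price for a ticker symbol.",
    "parameters": {
        "type": "object",
        "properties": {
            "ticker": {"type": "string", "description": "Stock ticker, e.g. 'MSFT'"},
        },
        "required": ["ticker"],
    },
}
```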
## Use Case Example: Adding a `get_current_date` Tool
Let's equip an agent with the ability to find the current date.
**Goal:** Define a tool that gets the current date and show how it could be executed by a specialized agent.
**Step 1: Define the Python Function**
First, we need the actual Python code that performs the action.
```python
# File: get_date_function.py
import datetime
def get_current_date() -> str:
"""Fetches the current date as a string."""
today = datetime.date.today()
return today.isoformat() # Returns date like "2023-10-27"
# Test the function
print(f"Function output: {get_current_date()}")
```
This is a standard Python function. It takes no arguments and returns the date as a string.
**Step 2: Wrap it as a `FunctionTool`**
AutoGen Core provides a convenient way to turn a Python function like this into a `Tool` object using `FunctionTool`. It automatically inspects the function's signature (arguments and return type) and docstring to help build the `ToolSchema`.
```python
# File: create_date_tool.py
from autogen_core.tools import FunctionTool
from get_date_function import get_current_date # Import our function
# Create the Tool instance
# We provide the function and a clear description for the LLM
date_tool = FunctionTool(
    func=get_current_date,
    description="Use this tool to get the current date in YYYY-MM-DD format."
    # Name defaults to function name 'get_current_date'
)
# Let's see what FunctionTool generated
print(f"Tool Name: {date_tool.name}")
print(f"Tool Description: {date_tool.description}")
# The schema defines inputs (none in this case)
# print(f"Tool Schema Parameters: {date_tool.schema['parameters']}")
# Output (simplified): {'type': 'object', 'properties': {}, 'required': []}
```
`FunctionTool` wraps our `get_current_date` function. It uses the function name as the tool name and the description we provided. It also correctly determines from the function signature that there are no input parameters (`properties: {}`).
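You can also exercise the tool directly, without any agent, which is handy for testing. The sketch below assumes `run_json` accepts the parsed arguments plus a `CancellationToken`:
```python
# File: run_date_tool_directly.py (illustrative test)
import asyncio
from autogen_core import CancellationToken
from create_date_tool import date_tool

async def main():
    # Inspect the auto-generated schema an LLM would see
    print(date_tool.schema["name"], "-", date_tool.schema.get("description"))
    # Execute the tool with no arguments
    result = await date_tool.run_json({}, CancellationToken())
    print("Result:", date_tool.return_value_as_string(result))

asyncio.run(main())
```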
**Step 3: How an Agent Might Request Tool Use**
Now we have a `date_tool`. How is it used? Typically, an LLM-powered agent (which we'll see more of in [Chapter 5: ChatCompletionClient](05_chatcompletionclient.md)) analyzes a request and decides a tool is needed. It then generates a request to *call* that tool, often using a specific message type like `FunctionCall`.
```python
# File: tool_call_request.py
from autogen_core import FunctionCall # Represents a request to call a tool
# Imagine an LLM agent decided to use the date tool.
# It constructs this message, providing the tool name and arguments (as JSON string).
date_call_request = FunctionCall(
    id="call_date_001",       # A unique ID for this specific call attempt
    name="get_current_date",  # Matches the Tool's name
    arguments="{}"            # An empty JSON object because no arguments are needed
)
print("FunctionCall message:", date_call_request)
# Output: FunctionCall(id='call_date_001', name='get_current_date', arguments='{}')
```
This `FunctionCall` message is like a work order: "Please execute the tool named `get_current_date` with these arguments."
**Step 4: The `ToolAgent` Executes the Tool**
Who receives this `FunctionCall` message? Usually, a specialized agent called `ToolAgent`. You create a `ToolAgent` and give it the list of tools it knows how to execute. When it receives a `FunctionCall`, it finds the matching tool and runs it.
```python
# File: tool_agent_example.py
import asyncio
from autogen_core.tool_agent import ToolAgent
from autogen_core.models import FunctionExecutionResult
from create_date_tool import date_tool # Import the tool we created
from tool_call_request import date_call_request # Import the request message
# Create an agent specifically designed to execute tools
tool_executor = ToolAgent(
    description="I can execute tools like getting the date.",
    tools=[date_tool]  # Give it the list of tools it manages
)
# --- Simulation of Runtime delivering the message ---
# In a real app, the AgentRuntime (Chapter 3) would route the
# date_call_request message to this tool_executor agent.
# We simulate the call to its message handler here:
async def simulate_execution():
    # Fake context (normally provided by runtime)
    class MockContext: cancellation_token = None
    ctx = MockContext()
    print(f"ToolAgent received request: {date_call_request.name}")
    result: FunctionExecutionResult = await tool_executor.handle_function_call(
        message=date_call_request,
        ctx=ctx
    )
    print(f"ToolAgent produced result: {result}")
asyncio.run(simulate_execution())
```
**Expected Output:**
```
ToolAgent received request: get_current_date
ToolAgent produced result: FunctionExecutionResult(content='2023-10-27', call_id='call_date_001', is_error=False, name='get_current_date') # Date will be current date
```
The `ToolAgent` received the `FunctionCall`, found the `date_tool` in its list, executed the underlying `get_current_date` function, and packaged the result (the date string) into a `FunctionExecutionResult` message. This result message can then be sent back to the agent that originally requested the tool use.
## Under the Hood: How Tool Execution Works
Let's visualize the typical flow when an LLM agent decides to use a tool managed by a `ToolAgent`.
**Conceptual Flow:**
```mermaid
sequenceDiagram
participant LLMA as LLM Agent (Decides)
participant Caller as Caller Agent (Orchestrates)
participant ToolA as ToolAgent (Executes)
participant ToolFunc as Tool Function (e.g., get_current_date)
Note over LLMA: Analyzes conversation, decides tool needed.
LLMA->>Caller: Sends AssistantMessage containing FunctionCall(name='get_current_date', args='{}')
Note over Caller: Receives LLM response, sees FunctionCall.
Caller->>+ToolA: Uses runtime.send_message(message=FunctionCall, recipient=ToolAgent_ID)
Note over ToolA: Receives FunctionCall via on_message.
ToolA->>ToolA: Looks up 'get_current_date' in its internal list of Tools.
ToolA->>+ToolFunc: Calls tool.run_json(args={}) -> triggers get_current_date()
ToolFunc-->>-ToolA: Returns the result (e.g., "2023-10-27")
ToolA->>ToolA: Creates FunctionExecutionResult message with the content.
ToolA-->>-Caller: Returns FunctionExecutionResult via runtime messaging.
Note over Caller: Receives the tool result.
Caller->>LLMA: Sends FunctionExecutionResultMessage to LLM for next step.
Note over LLMA: Now knows the current date.
```
1. **Decision:** An LLM-powered agent decides a tool is needed based on the conversation and the available tools' descriptions. It generates a `FunctionCall`.
2. **Request:** A "Caller" agent (often the same LLM agent or a managing agent) sends this `FunctionCall` message to the dedicated `ToolAgent` using the `AgentRuntime`.
3. **Lookup:** The `ToolAgent` receives the message, extracts the tool `name` (`get_current_date`), and finds the corresponding `Tool` object (our `date_tool`) in the list it was configured with.
4. **Execution:** The `ToolAgent` calls the `run_json` method on the `Tool` object, passing the arguments from the `FunctionCall`. For a `FunctionTool`, `run_json` validates the arguments against the generated schema and then executes the original Python function (`get_current_date`).
5. **Result:** The Python function returns its result (the date string).
6. **Response:** The `ToolAgent` wraps this result string in a `FunctionExecutionResult` message, including the original `call_id`, and sends it back to the Caller agent.
7. **Continuation:** The Caller agent typically sends this result back to the LLM agent, allowing the conversation or task to continue with the new information.
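A hedged sketch of the Caller side of steps 2 and 6 might look like this (the `tool_executor` registration and identifiers are assumptions for illustration):
```python
from autogen_core import AgentId, FunctionCall
from autogen_core.models import FunctionExecutionResult

async def execute_tool_calls(runtime, calls: list[FunctionCall]) -> list[FunctionExecutionResult]:
    # Assumes a ToolAgent was registered under the agent type "tool_executor"
    tool_agent_id = AgentId(type="tool_executor", key="default")
    results: list[FunctionExecutionResult] = []
    for call in calls:
        # The ToolAgent's FunctionCall handler returns a FunctionExecutionResult
        result = await runtime.send_message(call, recipient=tool_agent_id)
        results.append(result)
    return results
```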
**Code Glimpse:**
* **`Tool` Protocol (`tools/_base.py`):** Defines the basic contract any tool must fulfill. Key methods are `schema` (property returning the `ToolSchema`) and `run_json` (method to execute the tool with JSON-like arguments).
* **`BaseTool` (`tools/_base.py`):** An abstract class that helps implement the `Tool` protocol, especially using Pydantic models for defining arguments (`args_type`) and return values (`return_type`). It automatically generates the `parameters` part of the schema from the `args_type` model.
* **`FunctionTool` (`tools/_function_tool.py`):** Inherits from `BaseTool`. Its magic lies in automatically creating the `args_type` Pydantic model by inspecting the wrapped Python function's signature (`args_base_model_from_signature`). Its `run` method handles calling the original sync or async Python function.
```python
# Inside FunctionTool (Simplified Concept)
class FunctionTool(BaseTool[BaseModel, BaseModel]):
    def __init__(self, func, description, ...):
        self._func = func
        self._signature = get_typed_signature(func)
        # Automatically create Pydantic model for arguments
        args_model = args_base_model_from_signature(...)
        # Get return type from signature
        return_type = self._signature.return_annotation
        super().__init__(args_model, return_type, ...)

    async def run(self, args: BaseModel, ...):
        # Extract arguments from the 'args' model
        kwargs = args.model_dump()
        # Call the original Python function (sync or async)
        result = await self._call_underlying_func(**kwargs)
        return result  # Must match the expected return_type
```
* **`ToolAgent` (`tool_agent/_tool_agent.py`):** A specialized `RoutedAgent`. It registers a handler specifically for `FunctionCall` messages.
```python
# Inside ToolAgent (Simplified Concept)
class ToolAgent(RoutedAgent):
    def __init__(self, ..., tools: List[Tool]):
        super().__init__(...)
        self._tools = {tool.name: tool for tool in tools}  # Store tools by name

    @message_handler  # Registers this for FunctionCall messages
    async def handle_function_call(self, message: FunctionCall, ctx: MessageContext):
        # Find the tool by name
        tool = self._tools.get(message.name)
        if tool is None:
            # Handle error: Tool not found
            raise ToolNotFoundException(...)
        try:
            # Parse arguments string into a dictionary
            arguments = json.loads(message.arguments)
            # Execute the tool's run_json method
            result_obj = await tool.run_json(args=arguments, ...)
            # Convert result object back to string if needed
            result_str = tool.return_value_as_string(result_obj)
            # Create the success result message
            return FunctionExecutionResult(content=result_str, ...)
        except Exception as e:
            # Handle execution errors
            return FunctionExecutionResult(content=f"Error: {e}", is_error=True, ...)
Its core logic is: find tool -> parse args -> run tool -> return result/error.
## Next Steps
You've learned how **Tools** provide specific capabilities to Agents, defined by a **Schema** that LLMs can understand. We saw how `FunctionTool` makes it easy to wrap existing Python functions and how `ToolAgent` acts as the executor for these tools.
This ability for agents to use tools is fundamental to building powerful and versatile AI systems that can interact with the real world or perform complex calculations.
Now that agents can use tools, we need to understand more about the agents that *decide* which tools to use, which often involves interacting with Large Language Models:
* [Chapter 5: ChatCompletionClient](05_chatcompletionclient.md): How agents interact with LLMs like GPT to generate responses or decide on actions (like calling a tool).
* [Chapter 6: ChatCompletionContext](06_chatcompletioncontext.md): How the history of the conversation, including tool calls and results, is managed when talking to an LLM.
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

# Chapter 5: ChatCompletionClient - Talking to the Brains
So far, we've learned about:
* [Agents](01_agent.md): The workers in our system.
* [Messaging](02_messaging_system__topic___subscription_.md): How agents communicate broadly.
* [AgentRuntime](03_agentruntime.md): The manager that runs the show.
* [Tools](04_tool.md): How agents get specific skills.
But how does an agent actually *think* or *generate text*? Many powerful agents rely on Large Language Models (LLMs), such as GPT-4, Claude, or Gemini, as their "brains". How does an agent in AutoGen Core communicate with these external LLM services?
This is where the **`ChatCompletionClient`** comes in. It's the dedicated component for talking to LLMs.
## Motivation: Bridging the Gap to LLMs
Imagine you want to build an agent that can summarize long articles.
1. You give the agent an article (as a message).
2. The agent needs to send this article to an LLM (like GPT-4).
3. It also needs to tell the LLM: "Please summarize this."
4. The LLM processes the request and generates a summary.
5. The agent needs to receive this summary back from the LLM.
How does the agent handle the technical details of connecting to the LLM's specific API, formatting the request correctly, sending it over the internet, and understanding the response?
The `ChatCompletionClient` solves this! Think of it as the **standard phone line and translator** connecting your agent to the LLM service. You tell the client *what* to say (the conversation history and instructions), and it handles *how* to say it to the specific LLM and translates the LLM's reply back into a standard format.
## Key Concepts: Understanding the LLM Communicator
Let's break down the `ChatCompletionClient`:
1. **LLM Communication Bridge:** It's the primary way AutoGen agents interact with external LLM APIs (like OpenAI, Anthropic, Google Gemini, etc.). It hides the complexity of specific API calls.
2. **Standard Interface (`create` method):** It defines a common way to send requests and receive responses, regardless of the underlying LLM. The core method is `create`. You give it:
* `messages`: A list of messages representing the conversation history so far.
* Optional `tools`: A list of tools ([Chapter 4](04_tool.md)) the LLM might be able to use.
* Other parameters (like `json_output` hints, `cancellation_token`).
3. **Messages (`LLMMessage`):** The conversation history is passed as a sequence of specific message types defined in `autogen_core.models`:
* `SystemMessage`: Instructions for the LLM (e.g., "You are a helpful assistant.").
* `UserMessage`: Input from the user or another agent (e.g., the article text).
* `AssistantMessage`: Previous responses from the LLM (can include text or requests to call functions/tools).
* `FunctionExecutionResultMessage`: The results of executing a tool/function call.
4. **Tools (`ToolSchema`):** You can provide the schemas of available tools ([Chapter 4](04_tool.md)). The LLM might then respond not with text, but with a request to call one of these tools (`FunctionCall` inside an `AssistantMessage`).
5. **Response (`CreateResult`):** The `create` method returns a standard `CreateResult` object containing:
* `content`: The LLM's generated text or a list of `FunctionCall` requests.
* `finish_reason`: Why the LLM stopped generating (e.g., "stop", "length", "function_calls").
* `usage`: How many input (`prompt_tokens`) and output (`completion_tokens`) tokens were used.
* `cached`: Whether the response came from a cache.
6. **Token Tracking:** The client automatically tracks token usage (`prompt_tokens`, `completion_tokens`) for each call. You can query the total usage via methods like `total_usage()`. This is vital for monitoring costs, as most LLM APIs charge based on tokens.
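Items 4 and 5 above mean the `content` of a `CreateResult` can be either plain text or a list of `FunctionCall` requests. Here is a minimal, hedged sketch of handling both cases; it assumes you already have a configured client and some `FunctionTool` objects from [Chapter 4](04_tool.md), and the helper name `run_with_tools` is purely illustrative.

```python
import json

from autogen_core import CancellationToken
from autogen_core.models import CreateResult

async def run_with_tools(client, messages, tools):
    response: CreateResult = await client.create(messages=messages, tools=tools)
    if isinstance(response.content, str):
        return response.content  # The LLM answered with plain text
    # Otherwise the LLM asked us to call one or more tools
    tool_map = {tool.name: tool for tool in tools}
    results = []
    for call in response.content:  # each item is a FunctionCall
        tool = tool_map[call.name]
        arguments = json.loads(call.arguments)  # arguments arrive as a JSON string
        results.append(await tool.run_json(arguments, CancellationToken()))
    return results
```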
## Use Case Example: Summarizing Text with an LLM
Let's build a simplified scenario where we use a `ChatCompletionClient` to ask an LLM to summarize text.
**Goal:** Send text to an LLM via a client and get a summary back.
**Step 1: Prepare the Input Messages**
We need to structure our request as a list of `LLMMessage` objects.
```python
# File: prepare_messages.py
from autogen_core.models import SystemMessage, UserMessage
# Instructions for the LLM
system_prompt = SystemMessage(
content="You are a helpful assistant designed to summarize text concisely."
)
# The text we want to summarize
article_text = """
AutoGen is a framework that enables the development of LLM applications using multiple agents
that can converse with each other to solve tasks. AutoGen agents are customizable,
conversable, and can seamlessly allow human participation. They can operate in various modes
that employ combinations of LLMs, human inputs, and tools.
"""
user_request = UserMessage(
content=f"Please summarize the following text in one sentence:\n\n{article_text}",
source="User" # Indicate who provided this input
)
# Combine into a list for the client
messages_to_send = [system_prompt, user_request]
print("Messages prepared:")
for msg in messages_to_send:
print(f"- {msg.type}: {msg.content[:50]}...") # Print first 50 chars
```
This code defines the instructions (`SystemMessage`) and the user's request (`UserMessage`) and puts them in a list, ready to be sent.
**Step 2: Use the ChatCompletionClient (Conceptual)**
Now, we need an instance of a `ChatCompletionClient`. In a real application, you'd configure a specific client (like `OpenAIChatCompletionClient` with your API key). For this example, let's imagine we have a pre-configured client called `llm_client`.
```python
# File: call_llm_client.py
import asyncio
from autogen_core.models import CreateResult, RequestUsage
# Assume 'messages_to_send' is from the previous step
# Assume 'llm_client' is a pre-configured ChatCompletionClient instance
# (e.g., llm_client = OpenAIChatCompletionClient(config=...))
async def get_summary(client, messages):
print("\nSending messages to LLM via ChatCompletionClient...")
try:
# The core call: send messages, get structured result
response: CreateResult = await client.create(
messages=messages,
# We aren't providing tools in this simple example
tools=[]
)
print("Received response:")
print(f"- Finish Reason: {response.finish_reason}")
print(f"- Content: {response.content}") # This should be the summary
print(f"- Usage (Tokens): Prompt={response.usage.prompt_tokens}, Completion={response.usage.completion_tokens}")
print(f"- Cached: {response.cached}")
# Also, check total usage tracked by the client
total_usage = client.total_usage()
print(f"\nClient Total Usage: Prompt={total_usage.prompt_tokens}, Completion={total_usage.completion_tokens}")
except Exception as e:
print(f"An error occurred: {e}")
# --- Placeholder for actual client ---
class MockChatCompletionClient: # Simulate a real client
_total_usage = RequestUsage(prompt_tokens=0, completion_tokens=0)
async def create(self, messages, tools=[], **kwargs) -> CreateResult:
# Simulate API call and response
prompt_len = sum(len(str(m.content)) for m in messages) // 4 # Rough token estimate
summary = "AutoGen is a multi-agent framework for developing LLM applications."
completion_len = len(summary) // 4 # Rough token estimate
usage = RequestUsage(prompt_tokens=prompt_len, completion_tokens=completion_len)
self._total_usage.prompt_tokens += usage.prompt_tokens
self._total_usage.completion_tokens += usage.completion_tokens
return CreateResult(
finish_reason="stop", content=summary, usage=usage, cached=False
)
def total_usage(self) -> RequestUsage: return self._total_usage
# Other required methods (count_tokens, model_info etc.) omitted for brevity
async def main():
from prepare_messages import messages_to_send # Get messages from previous step
mock_client = MockChatCompletionClient()
await get_summary(mock_client, messages_to_send)
# asyncio.run(main()) # If you run this, it uses the mock client
```
This code shows the essential `client.create(...)` call. We pass our `messages_to_send` and receive a `CreateResult`. We then print the summary (`response.content`) and the token usage reported for that specific call (`response.usage`) and the total tracked by the client (`client.total_usage()`).
**How an Agent Uses It:**
Typically, an agent's logic (e.g., inside its `on_message` handler) would:
1. Receive an incoming message (like the article to summarize).
2. Prepare the list of `LLMMessage` objects (including system prompts, history, and the new request).
3. Access a `ChatCompletionClient` instance (often provided during agent setup or accessed via its context).
4. Call `await client.create(...)`.
5. Process the `CreateResult` (e.g., extract the summary text, check for function calls if tools were provided).
6. Potentially send the result as a new message to another agent or return it.
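Putting those steps together, here is a minimal sketch of such an agent. It assumes the `RoutedAgent` and `@message_handler` patterns from earlier chapters; the `SummarizerAgent` and `ArticleMessage` names are made up for illustration, and the client is passed in when the agent is constructed.

```python
from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, message_handler
from autogen_core.models import ChatCompletionClient, SystemMessage, UserMessage

@dataclass
class ArticleMessage:
    text: str

class SummarizerAgent(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("An agent that summarizes articles")
        self._model_client = model_client

    @message_handler
    async def handle_article(self, message: ArticleMessage, ctx: MessageContext) -> str:
        # Steps 1-2: build the LLM message list for this request
        llm_messages = [
            SystemMessage(content="You are a helpful assistant that summarizes text."),
            UserMessage(content=f"Summarize in one sentence:\n\n{message.text}", source="User"),
        ]
        # Steps 3-4: call the client
        result = await self._model_client.create(
            messages=llm_messages, cancellation_token=ctx.cancellation_token
        )
        # Steps 5-6: return the generated text to whoever sent the article
        assert isinstance(result.content, str)
        return result.content
```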
## Under the Hood: How the Client Talks to the LLM
What happens when you call `await client.create(...)`?
**Conceptual Flow:**
```mermaid
sequenceDiagram
participant Agent as Agent Logic
participant Client as ChatCompletionClient
participant Formatter as API Formatter
participant HTTP as HTTP Client
participant LLM_API as External LLM API
Agent->>+Client: create(messages, tools)
Client->>+Formatter: Format messages & tools for specific API (e.g., OpenAI JSON format)
Formatter-->>-Client: Return formatted request body
Client->>+HTTP: Send POST request to LLM API endpoint with formatted body & API Key
HTTP->>+LLM_API: Transmit request over network
LLM_API->>LLM_API: Process request, generate completion/function call
LLM_API-->>-HTTP: Return API response (e.g., JSON)
HTTP-->>-Client: Receive HTTP response
Client->>+Formatter: Parse API response (extract content, usage, finish_reason)
Formatter-->>-Client: Return parsed data
Client->>Client: Create standard CreateResult object
Client-->>-Agent: Return CreateResult
```
1. **Prepare:** The `ChatCompletionClient` takes the standard `LLMMessage` list and `ToolSchema` list.
2. **Format:** It translates these into the specific format required by the target LLM's API (e.g., the JSON structure expected by OpenAI's `/chat/completions` endpoint). This might involve renaming roles (like `SystemMessage` to `system`), formatting tool descriptions, etc.
3. **Request:** It uses an underlying HTTP client to send a network request (usually a POST request) to the LLM service's API endpoint, including the formatted data and authentication (like an API key).
4. **Wait & Receive:** It waits for the LLM service to process the request and send back a response over the network.
5. **Parse:** It receives the raw HTTP response (usually JSON) from the API.
6. **Standardize:** It parses this specific API response, extracting the generated text or function calls, token usage figures, finish reason, etc.
7. **Return:** It packages all this information into a standard `CreateResult` object and returns it to the calling agent code.
**Code Glimpse:**
* **`ChatCompletionClient` Protocol (`models/_model_client.py`):** This is the abstract base class (or protocol) defining the *contract* that all specific clients must follow.
```python
# From: models/_model_client.py (Simplified ABC)
from abc import ABC, abstractmethod
from typing import Sequence, Optional, Mapping, Any, AsyncGenerator, Union
from ._types import LLMMessage, CreateResult, RequestUsage
from ..tools import Tool, ToolSchema
from .. import CancellationToken
class ChatCompletionClient(ABC):
@abstractmethod
async def create(
self, messages: Sequence[LLMMessage], *,
tools: Sequence[Tool | ToolSchema] = [],
json_output: Optional[bool] = None, # Hint for JSON mode
extra_create_args: Mapping[str, Any] = {}, # API-specific args
cancellation_token: Optional[CancellationToken] = None,
) -> CreateResult: ... # The core method
@abstractmethod
def create_stream(
self, # Similar to create, but yields results incrementally
# ... parameters ...
) -> AsyncGenerator[Union[str, CreateResult], None]: ...
@abstractmethod
def total_usage(self) -> RequestUsage: ... # Get total tracked usage
@abstractmethod
def count_tokens(self, messages: Sequence[LLMMessage], *, tools: Sequence[Tool | ToolSchema] = []) -> int: ... # Estimate token count
# Other methods like close(), actual_usage(), remaining_tokens(), model_info...
```
Concrete classes like `OpenAIChatCompletionClient` and `AnthropicChatCompletionClient` implement these methods using the specific libraries and API calls for each service.
* **`LLMMessage` Types (`models/_types.py`):** These define the structure of messages passed *to* the client.
```python
# From: models/_types.py (Simplified)
from pydantic import BaseModel
from typing import List, Union, Literal
from .. import FunctionCall # From Chapter 4 context
class SystemMessage(BaseModel):
content: str
type: Literal["SystemMessage"] = "SystemMessage"
class UserMessage(BaseModel):
content: Union[str, List[Union[str, Image]]] # Can include images!
source: str
type: Literal["UserMessage"] = "UserMessage"
class AssistantMessage(BaseModel):
content: Union[str, List[FunctionCall]] # Can be text or function calls
source: str
type: Literal["AssistantMessage"] = "AssistantMessage"
# FunctionExecutionResultMessage also exists here...
```
* **`CreateResult` (`models/_types.py`):** This defines the structure of the response *from* the client.
```python
# From: models/_types.py (Simplified)
from pydantic import BaseModel
from dataclasses import dataclass
from typing import Union, List, Optional, Literal
from .. import FunctionCall
@dataclass
class RequestUsage:
prompt_tokens: int
completion_tokens: int
FinishReasons = Literal["stop", "length", "function_calls", "content_filter", "unknown"]
class CreateResult(BaseModel):
finish_reason: FinishReasons
content: Union[str, List[FunctionCall]] # LLM output
usage: RequestUsage # Token usage for this call
cached: bool
# Optional fields like logprobs, thought...
```
Using these standard types ensures that agent logic can work consistently, even if you switch the underlying LLM service by using a different `ChatCompletionClient` implementation.
## Next Steps
You now understand the role of `ChatCompletionClient` as the crucial link between AutoGen agents and the powerful capabilities of Large Language Models. It provides a standard way to send conversational history and tool definitions, receive generated text or function call requests, and track token usage.
Managing the conversation history (`messages`) sent to the client is very important. How do you ensure the LLM has the right context, especially after tool calls have happened?
* [Chapter 6: ChatCompletionContext](06_chatcompletioncontext.md): Learn how AutoGen helps manage the conversation history, including adding tool call requests and their results, before sending it to the `ChatCompletionClient`.
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

# Chapter 6: ChatCompletionContext - Remembering the Conversation
In [Chapter 5: ChatCompletionClient](05_chatcompletionclient.md), we learned how agents talk to Large Language Models (LLMs) using a `ChatCompletionClient`. We saw that we need to send a list of `messages` (the conversation history) to the LLM so it knows the context.
But conversations can get very long! Imagine talking on the phone for an hour. Can you remember *every single word* that was said? Probably not. You remember the main points, the beginning, and what was said most recently. LLMs have a similar limitation: they can only pay attention to a certain amount of text at once (called the "context window").
If we send the *entire* history of a very long chat, it might be too much for the LLM, lead to errors, be slow, or cost more money (since many LLMs charge based on the amount of text).
So, how do we smartly choose *which* parts of the conversation history to send? This is the problem that **`ChatCompletionContext`** solves.
## Motivation: Keeping LLM Conversations Focused
Let's say we have a helpful assistant agent chatting with a user:
1. **User:** "Hi! Can you tell me about AutoGen?"
2. **Assistant:** "Sure! AutoGen is a framework..." (provides details)
3. **User:** "Thanks! Now, can you draft an email to my team about our upcoming meeting?"
4. **Assistant:** "Okay, what's the meeting about?"
5. **User:** "It's about the project planning for Q3."
6. **Assistant:** (Needs to draft the email)
When the Assistant needs to draft the email (step 6), does it need the *exact* text from step 2 about what AutoGen is? Probably not. It definitely needs the instructions from step 3 and the topic from step 5. Maybe the initial greeting isn't super important either.
`ChatCompletionContext` acts like a **smart transcript editor**. Before sending the history to the LLM via the `ChatCompletionClient`, it reviews the full conversation log and prepares a shorter, focused version containing only the messages it thinks are most relevant for the LLM's next response.
## Key Concepts: Managing the Chat History
1. **The Full Transcript Holder:** A `ChatCompletionContext` object holds the *complete* list of messages (`LLMMessage` objects like `SystemMessage`, `UserMessage`, `AssistantMessage` from Chapter 5) that have occurred in a specific conversation thread. You add new messages using its `add_message` method.
2. **The Smart View Generator (`get_messages`):** The core job of `ChatCompletionContext` is done by its `get_messages` method. When called, it looks at the *full* transcript it holds, but returns only a *subset* of those messages based on its specific strategy. This subset is what you'll actually send to the `ChatCompletionClient`.
3. **Different Strategies for Remembering:** Because different situations require different focus, AutoGen Core provides several `ChatCompletionContext` implementations (strategies):
* **`UnboundedChatCompletionContext`:** The simplest (and sometimes riskiest!). It doesn't edit anything; `get_messages` just returns the *entire* history. Good for short chats, but can break with long ones.
* **`BufferedChatCompletionContext`:** Like remembering only the last few things someone said. It keeps the most recent `N` messages (where `N` is the `buffer_size` you set). Good for focusing on recent interactions.
* **`HeadAndTailChatCompletionContext`:** Tries to get the best of both worlds. It keeps the first few messages (the "head", maybe containing initial instructions) and the last few messages (the "tail", the recent context). It skips the messages in the middle.
## Use Case Example: Chatting with Different Memory Strategies
Let's simulate adding messages to different context managers and see what `get_messages` returns.
**Step 1: Define some messages**
```python
# File: define_chat_messages.py
from autogen_core.models import (
SystemMessage, UserMessage, AssistantMessage, LLMMessage
)
from typing import List
# The initial instruction for the assistant
system_msg = SystemMessage(content="You are a helpful assistant.")
# A sequence of user/assistant turns
chat_sequence: List[LLMMessage] = [
UserMessage(content="What is AutoGen?", source="User"),
AssistantMessage(content="AutoGen is a multi-agent framework...", source="Agent"),
UserMessage(content="What can it do?", source="User"),
AssistantMessage(content="It can build complex LLM apps.", source="Agent"),
UserMessage(content="Thanks!", source="User")
]
# Combine system message and the chat sequence
full_history: List[LLMMessage] = [system_msg] + chat_sequence
print(f"Total messages in full history: {len(full_history)}")
# Output: Total messages in full history: 6
```
We have a full history of 6 messages (1 system + 5 chat turns).
**Step 2: Use `UnboundedChatCompletionContext`**
This context keeps everything.
```python
# File: use_unbounded_context.py
import asyncio
from define_chat_messages import full_history
from autogen_core.model_context import UnboundedChatCompletionContext
async def main():
# Create context and add all messages
context = UnboundedChatCompletionContext()
for msg in full_history:
await context.add_message(msg)
# Get the messages to send to the LLM
messages_for_llm = await context.get_messages()
print(f"--- Unbounded Context ({len(messages_for_llm)} messages) ---")
for i, msg in enumerate(messages_for_llm):
print(f"{i+1}. [{msg.type}]: {msg.content[:30]}...")
# asyncio.run(main()) # If run
```
**Expected Output (Unbounded):**
```
--- Unbounded Context (6 messages) ---
1. [SystemMessage]: You are a helpful assistant....
2. [UserMessage]: What is AutoGen?...
3. [AssistantMessage]: AutoGen is a multi-agent fram...
4. [UserMessage]: What can it do?...
5. [AssistantMessage]: It can build complex LLM apps...
6. [UserMessage]: Thanks!...
```
It returns all 6 messages, exactly as added.
**Step 3: Use `BufferedChatCompletionContext`**
Let's keep only the last 3 messages.
```python
# File: use_buffered_context.py
import asyncio
from define_chat_messages import full_history
from autogen_core.model_context import BufferedChatCompletionContext
async def main():
# Keep only the last 3 messages
context = BufferedChatCompletionContext(buffer_size=3)
for msg in full_history:
await context.add_message(msg)
messages_for_llm = await context.get_messages()
print(f"--- Buffered Context (buffer=3, {len(messages_for_llm)} messages) ---")
for i, msg in enumerate(messages_for_llm):
print(f"{i+1}. [{msg.type}]: {msg.content[:30]}...")
# asyncio.run(main()) # If run
```
**Expected Output (Buffered):**
```
--- Buffered Context (buffer=3, 3 messages) ---
1. [UserMessage]: What can it do?...
2. [AssistantMessage]: It can build complex LLM apps...
3. [UserMessage]: Thanks!...
```
It only returns the last 3 messages from the full history. The system message and the first chat turn are omitted.
**Step 4: Use `HeadAndTailChatCompletionContext`**
Let's keep the first message (head=1) and the last two messages (tail=2).
```python
# File: use_head_tail_context.py
import asyncio
from define_chat_messages import full_history
from autogen_core.model_context import HeadAndTailChatCompletionContext
async def main():
# Keep first 1 and last 2 messages
context = HeadAndTailChatCompletionContext(head_size=1, tail_size=2)
for msg in full_history:
await context.add_message(msg)
messages_for_llm = await context.get_messages()
print(f"--- Head & Tail Context (h=1, t=2, {len(messages_for_llm)} messages) ---")
for i, msg in enumerate(messages_for_llm):
print(f"{i+1}. [{msg.type}]: {msg.content[:30]}...")
# asyncio.run(main()) # If run
```
**Expected Output (Head & Tail):**
```
--- Head & Tail Context (h=1, t=2, 4 messages) ---
1. [SystemMessage]: You are a helpful assistant....
2. [UserMessage]: Skipped 3 messages....
3. [AssistantMessage]: It can build complex LLM apps...
4. [UserMessage]: Thanks!...
```
It keeps the very first message (`SystemMessage`), then inserts a placeholder telling the LLM that some messages were skipped, and finally includes the last two messages. This preserves the initial instruction and the most recent context.
**Which one to choose?** It depends on your agent's task!
* Simple Q&A? `Buffered` might be fine.
* Following complex initial instructions? `HeadAndTail` or even `Unbounded` (if short) might be better.
## Under the Hood: How Context is Managed
The core idea is defined by the `ChatCompletionContext` abstract base class.
**Conceptual Flow:**
```mermaid
sequenceDiagram
participant Agent as Agent Logic
participant Context as ChatCompletionContext
participant FullHistory as Internal Message List
Agent->>+Context: add_message(newMessage)
Context->>+FullHistory: Append newMessage to list
FullHistory-->>-Context: List updated
Context-->>-Agent: Done
Agent->>+Context: get_messages()
Context->>+FullHistory: Read the full list
FullHistory-->>-Context: Return full list
Context->>Context: Apply Strategy (e.g., slice list for Buffered/HeadTail)
Context-->>-Agent: Return selected list of messages
```
1. **Adding:** When `add_message(message)` is called, the context simply appends the `message` to its internal list (`self._messages`).
2. **Getting:** When `get_messages()` is called:
* The context accesses its internal `self._messages` list.
* The specific implementation (`Unbounded`, `Buffered`, `HeadAndTail`) applies its logic to select which messages to return.
* It returns the selected list.
**Code Glimpse:**
* **Base Class (`_chat_completion_context.py`):** Defines the structure and common methods.
```python
# From: model_context/_chat_completion_context.py (Simplified)
from abc import ABC, abstractmethod
from typing import List
from ..models import LLMMessage
class ChatCompletionContext(ABC):
component_type = "chat_completion_context" # Identifies this as a component type
def __init__(self, initial_messages: List[LLMMessage] | None = None) -> None:
# Holds the COMPLETE history
self._messages: List[LLMMessage] = initial_messages or []
async def add_message(self, message: LLMMessage) -> None:
"""Add a message to the full context."""
self._messages.append(message)
@abstractmethod
async def get_messages(self) -> List[LLMMessage]:
"""Get the subset of messages based on the strategy."""
# Each subclass MUST implement this logic
...
# Other methods like clear(), save_state(), load_state() exist too
```
The base class handles storing messages; subclasses define *how* to retrieve them.
* **Unbounded (`_unbounded_chat_completion_context.py`):** The simplest implementation.
```python
# From: model_context/_unbounded_chat_completion_context.py (Simplified)
from typing import List
from ._chat_completion_context import ChatCompletionContext
from ..models import LLMMessage
class UnboundedChatCompletionContext(ChatCompletionContext):
async def get_messages(self) -> List[LLMMessage]:
"""Returns all messages."""
return self._messages # Just return the whole internal list
```
* **Buffered (`_buffered_chat_completion_context.py`):** Uses slicing to get the end of the list.
```python
# From: model_context/_buffered_chat_completion_context.py (Simplified)
from typing import List
from ._chat_completion_context import ChatCompletionContext
from ..models import LLMMessage, FunctionExecutionResultMessage
class BufferedChatCompletionContext(ChatCompletionContext):
def __init__(self, buffer_size: int, ...):
super().__init__(...)
self._buffer_size = buffer_size
async def get_messages(self) -> List[LLMMessage]:
"""Get at most `buffer_size` recent messages."""
# Slice the list to get the last 'buffer_size' items
messages = self._messages[-self._buffer_size :]
# Special case: Avoid starting with a function result message
if messages and isinstance(messages[0], FunctionExecutionResultMessage):
messages = messages[1:]
return messages
```
* **Head and Tail (`_head_and_tail_chat_completion_context.py`):** Combines slices from the beginning and end.
```python
# From: model_context/_head_and_tail_chat_completion_context.py (Simplified)
from typing import List
from ._chat_completion_context import ChatCompletionContext
from ..models import LLMMessage, UserMessage
class HeadAndTailChatCompletionContext(ChatCompletionContext):
def __init__(self, head_size: int, tail_size: int, ...):
super().__init__(...)
self._head_size = head_size
self._tail_size = tail_size
async def get_messages(self) -> List[LLMMessage]:
head = self._messages[: self._head_size] # First 'head_size' items
tail = self._messages[-self._tail_size :] # Last 'tail_size' items
num_skipped = len(self._messages) - len(head) - len(tail)
if num_skipped <= 0: # If no overlap or gap
return self._messages
else: # If messages were skipped
placeholder = [UserMessage(content=f"Skipped {num_skipped} messages.", source="System")]
# Combine head + placeholder + tail
return head + placeholder + tail
```
These implementations provide different ways to manage the context window effectively.
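Because the base class only requires `get_messages`, you can also write your own strategy. Here is a hedged sketch (based on the simplified base class above; a production version might also want the `Component` plumbing from [Chapter 8](08_component.md)) that always keeps the first message plus the most recent `tail_size` messages, with no placeholder for the skipped middle:

```python
from typing import List

from autogen_core.model_context import ChatCompletionContext
from autogen_core.models import LLMMessage

class FirstPlusRecentContext(ChatCompletionContext):
    """Keep the first message (usually the SystemMessage) and the last `tail_size`."""

    def __init__(self, tail_size: int) -> None:
        super().__init__()
        self._tail_size = tail_size

    async def get_messages(self) -> List[LLMMessage]:
        if len(self._messages) <= self._tail_size + 1:
            return list(self._messages)  # Nothing to trim yet
        return [self._messages[0]] + self._messages[-self._tail_size:]
```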
## Putting it Together with ChatCompletionClient
How does an agent use `ChatCompletionContext` with the `ChatCompletionClient` from Chapter 5?
1. An agent has an instance of a `ChatCompletionContext` (e.g., `BufferedChatCompletionContext`) to store its conversation history.
2. When the agent receives a new message (e.g., a `UserMessage`), it calls `await context.add_message(new_user_message)`.
3. To prepare for calling the LLM, the agent calls `messages_to_send = await context.get_messages()`. This gets the strategically selected subset of the history.
4. The agent then passes this list to the `ChatCompletionClient`: `response = await llm_client.create(messages=messages_to_send, ...)`.
5. When the LLM replies (e.g., with an `AssistantMessage`), the agent adds it back to the context: `await context.add_message(llm_response_message)`.
This loop ensures that the history is continuously updated and intelligently trimmed before each call to the LLM.
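As a rough sketch of that loop (assuming `llm_client` is a configured `ChatCompletionClient` from Chapter 5 and each call handles one user turn):

```python
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_core.models import AssistantMessage, UserMessage

# The agent keeps one context per conversation thread
context = BufferedChatCompletionContext(buffer_size=10)

async def one_turn(llm_client, user_text: str) -> str:
    # 2. Record the new user message in the full history
    await context.add_message(UserMessage(content=user_text, source="User"))
    # 3. Get the strategically trimmed view of the history
    messages_to_send = await context.get_messages()
    # 4. Call the LLM with that subset
    result = await llm_client.create(messages=messages_to_send)
    # 5. Record the LLM's reply so future turns can see it
    await context.add_message(AssistantMessage(content=result.content, source="Assistant"))
    return str(result.content)
```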
## Next Steps
You've learned how `ChatCompletionContext` helps manage the conversation history sent to LLMs, preventing context window overflows and keeping the interaction focused using different strategies (`Unbounded`, `Buffered`, `HeadAndTail`).
This context management is a specific form of **memory**. Agents might need to remember things beyond just the chat history. How do they store general information, state, or knowledge over time?
* [Chapter 7: Memory](07_memory.md): Explore the broader concept of Memory in AutoGen Core, which provides more general ways for agents to store and retrieve information.
* [Chapter 8: Component](08_component.md): Understand how `ChatCompletionContext` fits into the general `Component` model, allowing configuration and integration within the AutoGen system.
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

# Chapter 7: Memory - The Agent's Notebook
In [Chapter 6: ChatCompletionContext](06_chatcompletioncontext.md), we saw how agents manage the *short-term* history of a single conversation before talking to an LLM. It's like remembering what was just said in the last few minutes.
But what if an agent needs to remember things for much longer, across *multiple* conversations or tasks? For example, imagine an assistant agent that learns your preferences:
* You tell it: "Please always write emails in a formal style for me."
* Weeks later, you ask it to draft a new email.
How does it remember that preference? The short-term `ChatCompletionContext` might have forgotten the earlier instruction, especially if using a strategy like `BufferedChatCompletionContext`. The agent needs a **long-term memory**.
This is where the **`Memory`** abstraction comes in. Think of it as the agent's **long-term notebook or database**. While `ChatCompletionContext` is the scratchpad for the current chat, `Memory` holds persistent information the agent can add to or look up later.
## Motivation: Remembering Across Conversations
Our goal is to give an agent the ability to store a piece of information (like a user preference) and retrieve it later to influence its behavior, even in a completely new conversation. `Memory` provides the mechanism for this long-term storage and retrieval.
## Key Concepts: How the Notebook Works
1. **What it Stores (`MemoryContent`):** Agents can store various types of information in their memory. This could be:
* Plain text notes (`text/plain`)
* Structured data like JSON (`application/json`)
* Even images (`image/*`)
Each piece of information is wrapped in a `MemoryContent` object, which includes the data itself, its type (`mime_type`), and optional descriptive `metadata`.
```python
# From: memory/_base_memory.py (Simplified Concept)
from pydantic import BaseModel
from typing import Any, Dict, Union
# Represents one entry in the memory notebook
class MemoryContent(BaseModel):
content: Union[str, bytes, Dict[str, Any]] # The actual data
mime_type: str # What kind of data (e.g., "text/plain")
metadata: Dict[str, Any] | None = None # Extra info (optional)
```
This standard format helps manage different kinds of memories.
2. **Adding to Memory (`add`):** When an agent learns something important it wants to remember long-term (like the user's preferred style), it uses the `memory.add(content)` method. This is like writing a new entry in the notebook.
3. **Querying Memory (`query`):** When an agent needs to recall information, it can use `memory.query(query_text)`. This is like searching the notebook for relevant entries. How the search works depends on the specific memory implementation (it could be a simple text match, or a sophisticated vector search in more advanced memories).
4. **Updating Chat Context (`update_context`):** This is a crucial link! Before an agent talks to the LLM (using the `ChatCompletionClient` from [Chapter 5](05_chatcompletionclient.md)), it can call the `memory.update_context(chat_context)` method. This method:
* Looks at the current conversation (`chat_context`).
* Queries the long-term memory (`Memory`) for relevant information.
* Injects the retrieved memories *into* the `chat_context`, often as a `SystemMessage`.
This way, the LLM gets the benefit of the long-term memory *in addition* to the short-term conversation history, right before generating its response.
5. **Different Memory Implementations:** Just like there are different `ChatCompletionContext` strategies, there can be different `Memory` implementations:
* `ListMemory`: A very simple memory that stores everything in a Python list (like a simple chronological notebook).
* *Future Possibilities*: More advanced implementations could use databases or vector stores for more efficient storage and retrieval of vast amounts of information.
## Use Case Example: Remembering User Preferences with `ListMemory`
Let's implement our user preference use case using the simple `ListMemory`.
**Goal:**
1. Create a `ListMemory`.
2. Add a user preference ("formal style") to it.
3. Start a *new* chat context.
4. Use `update_context` to inject the preference into the new chat context.
5. Show how the chat context looks *before* being sent to the LLM.
**Step 1: Create the Memory**
We'll use `ListMemory`, the simplest implementation provided by AutoGen Core.
```python
# File: create_list_memory.py
from autogen_core.memory import ListMemory
# Create a simple list-based memory instance
user_prefs_memory = ListMemory(name="user_preferences")
print(f"Created memory: {user_prefs_memory.name}")
print(f"Initial content: {user_prefs_memory.content}")
# Output:
# Created memory: user_preferences
# Initial content: []
```
We have an empty memory notebook named "user_preferences".
**Step 2: Add the Preference**
Let's add the user's preference as a piece of text memory.
```python
# File: add_preference.py
import asyncio
from autogen_core.memory import MemoryContent
# Assume user_prefs_memory exists from the previous step
# Define the preference as MemoryContent
preference = MemoryContent(
content="User prefers all communication to be written in a formal style.",
mime_type="text/plain", # It's just text
metadata={"source": "user_instruction_conversation_1"} # Optional info
)
async def add_to_memory():
# Add the content to our memory instance
await user_prefs_memory.add(preference)
print(f"Memory content after adding: {user_prefs_memory.content}")
asyncio.run(add_to_memory())
# Output (will show the MemoryContent object):
# Memory content after adding: [MemoryContent(content='User prefers...', mime_type='text/plain', metadata={'source': '...'})]
```
We've successfully written the preference into our `ListMemory` notebook.
**Step 3: Start a New Chat Context**
Imagine time passes, and the user starts a new conversation asking for an email draft. We create a fresh `ChatCompletionContext`.
```python
# File: start_new_chat.py
from autogen_core.model_context import UnboundedChatCompletionContext
from autogen_core.models import UserMessage
# Start a new, empty chat context for a new task
new_chat_context = UnboundedChatCompletionContext()
# Add the user's new request
new_request = UserMessage(content="Draft an email to the team about the Q3 results.", source="User")
# await new_chat_context.add_message(new_request) # In a real app, add the request
print("Created a new, empty chat context.")
# Output: Created a new, empty chat context.
```
This context currently *doesn't* know about the "formal style" preference stored in our long-term memory.
**Step 4: Inject Memory into Chat Context**
Before sending the `new_chat_context` to the LLM, we use `update_context` to bring in relevant long-term memories.
```python
# File: update_chat_with_memory.py
import asyncio
# Assume user_prefs_memory exists (with the preference added)
# Assume new_chat_context exists (empty or with just the new request)
# Assume new_request exists
async def main():
# --- This is where Memory connects to Chat Context ---
print("Updating chat context with memory...")
update_result = await user_prefs_memory.update_context(new_chat_context)
print(f"Memories injected: {len(update_result.memories.results)}")
# Now let's add the actual user request for this task
await new_chat_context.add_message(new_request)
# See what messages are now in the context
messages_for_llm = await new_chat_context.get_messages()
print("\nMessages to be sent to LLM:")
for msg in messages_for_llm:
print(f"- [{msg.type}]: {msg.content}")
asyncio.run(main())
```
**Expected Output:**
```
Updating chat context with memory...
Memories injected: 1
Messages to be sent to LLM:
- [SystemMessage]:
Relevant memory content (in chronological order):
1. User prefers all communication to be written in a formal style.
- [UserMessage]: Draft an email to the team about the Q3 results.
```
Look! The `ListMemory.update_context` method automatically queried the memory (in this simple case, it just takes *all* entries) and added a `SystemMessage` to the `new_chat_context`. This message explicitly tells the LLM about the stored preference *before* it sees the user's request to draft the email.
**Step 5: (Conceptual) Sending to LLM**
Now, if we were to send `messages_for_llm` to the `ChatCompletionClient` (Chapter 5):
```python
# Conceptual code - Requires a configured client
# response = await llm_client.create(messages=messages_for_llm)
```
The LLM would receive both the instruction about the formal style preference (from Memory) and the request to draft the email. It's much more likely to follow the preference now!
**Step 6: Direct Query (Optional)**
We can also directly query the memory if needed, without involving a chat context.
```python
# File: query_memory.py
import asyncio
# Assume user_prefs_memory exists
async def main():
# Query the memory (ListMemory returns all items regardless of query text)
query_result = await user_prefs_memory.query("style preference")
print("\nDirect query result:")
for item in query_result.results:
print(f"- Content: {item.content}, Type: {item.mime_type}")
asyncio.run(main())
# Output:
# Direct query result:
# - Content: User prefers all communication to be written in a formal style., Type: text/plain
```
This shows how an agent could specifically look things up in its notebook.
## Under the Hood: How `ListMemory` Injects Context
Let's trace the `update_context` call for `ListMemory`.
**Conceptual Flow:**
```mermaid
sequenceDiagram
participant AgentLogic as Agent Logic
participant ListMem as ListMemory
participant InternalList as Memory's Internal List
participant ChatCtx as ChatCompletionContext
AgentLogic->>+ListMem: update_context(chat_context)
ListMem->>+InternalList: Get all stored MemoryContent items
InternalList-->>-ListMem: Return list of [pref_content]
alt Memory list is NOT empty
ListMem->>ListMem: Format memories into a single string (e.g., "1. pref_content")
ListMem->>ListMem: Create SystemMessage with formatted string
ListMem->>+ChatCtx: add_message(SystemMessage)
ChatCtx-->>-ListMem: Context updated
end
ListMem->>ListMem: Create UpdateContextResult(memories=[pref_content])
ListMem-->>-AgentLogic: Return UpdateContextResult
```
1. The agent calls `user_prefs_memory.update_context(new_chat_context)`.
2. The `ListMemory` instance accesses its internal `_contents` list.
3. It checks if the list is empty. If not:
4. It iterates through the `MemoryContent` items in the list.
5. It formats them into a numbered string (like "Relevant memory content...\n1. Item 1\n2. Item 2...").
6. It creates a single `SystemMessage` containing this formatted string.
7. It calls `new_chat_context.add_message()` to add this `SystemMessage` to the chat history that will be sent to the LLM.
8. It returns an `UpdateContextResult` containing the list of memories it just processed.
**Code Glimpse:**
* **`Memory` Protocol (`memory/_base_memory.py`):** Defines the required methods for any memory implementation.
```python
# From: memory/_base_memory.py (Simplified ABC)
from abc import ABC, abstractmethod
# ... other imports: MemoryContent, MemoryQueryResult, UpdateContextResult, ChatCompletionContext
class Memory(ABC):
component_type = "memory"
@abstractmethod
async def update_context(self, model_context: ChatCompletionContext) -> UpdateContextResult: ...
@abstractmethod
async def query(self, query: str | MemoryContent, ...) -> MemoryQueryResult: ...
@abstractmethod
async def add(self, content: MemoryContent, ...) -> None: ...
@abstractmethod
async def clear(self) -> None: ...
@abstractmethod
async def close(self) -> None: ...
```
Any class wanting to act as Memory must provide these methods.
* **`ListMemory` Implementation (`memory/_list_memory.py`):**
```python
# From: memory/_list_memory.py (Simplified)
from typing import List
# ... other imports: Memory, MemoryContent, ..., SystemMessage, ChatCompletionContext
class ListMemory(Memory):
def __init__(self, ..., memory_contents: List[MemoryContent] | None = None):
# Stores memory items in a simple list
self._contents: List[MemoryContent] = memory_contents or []
async def add(self, content: MemoryContent, ...) -> None:
"""Add new content to the internal list."""
self._contents.append(content)
async def query(self, query: str | MemoryContent = "", ...) -> MemoryQueryResult:
"""Return all memories, ignoring the query."""
# Simple implementation: just return everything
return MemoryQueryResult(results=self._contents)
async def update_context(self, model_context: ChatCompletionContext) -> UpdateContextResult:
"""Add all memories as a SystemMessage to the chat context."""
if not self._contents: # Do nothing if memory is empty
return UpdateContextResult(memories=MemoryQueryResult(results=[]))
# Format all memories into a numbered list string
memory_strings = [f"{i}. {str(mem.content)}" for i, mem in enumerate(self._contents, 1)]
memory_context_str = "Relevant memory content...\n" + "\n".join(memory_strings) + "\n"
# Add this string as a SystemMessage to the provided chat context
await model_context.add_message(SystemMessage(content=memory_context_str))
# Return info about which memories were added
return UpdateContextResult(memories=MemoryQueryResult(results=self._contents))
# ... clear(), close(), config methods ...
```
This shows the straightforward logic of `ListMemory`: store in a list, retrieve the whole list, and inject the whole list as a single system message into the chat context. More complex memories might use smarter retrieval (e.g., based on the `query` in `query()` or the last message in `update_context`) and inject memories differently.
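As a tiny illustration of "smarter retrieval", here is a hedged sketch of a memory that keeps `ListMemory`'s storage but makes `query` do a simple case-insensitive keyword match instead of returning everything. The class name is made up; real implementations would more likely use embeddings or a database.

```python
from typing import List

from autogen_core.memory import ListMemory, MemoryContent, MemoryQueryResult

class KeywordMemory(ListMemory):
    """A ListMemory whose query() only returns entries containing the query text."""

    async def query(self, query: str | MemoryContent = "", **kwargs) -> MemoryQueryResult:
        text = query if isinstance(query, str) else str(query.content)
        hits: List[MemoryContent] = [
            item for item in self.content
            if text.lower() in str(item.content).lower()
        ]
        return MemoryQueryResult(results=hits)
```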
## Next Steps
You've learned about `Memory`, AutoGen Core's mechanism for giving agents long-term recall beyond the immediate conversation (`ChatCompletionContext`). We saw how `MemoryContent` holds information, `add` stores it, `query` retrieves it, and `update_context` injects relevant memories into the LLM's working context. We explored the simple `ListMemory` as a basic example.
Memory systems are crucial for agents that learn, adapt, or need to maintain state across interactions.
This concludes our deep dive into the core abstractions of AutoGen Core! We've covered Agents, Messaging, Runtime, Tools, LLM Clients, Chat Context, and now Memory. There's one final concept that ties many of these together from a configuration perspective:
* [Chapter 8: Component](08_component.md): Understand the general `Component` model in AutoGen Core, how it allows pieces like `Memory`, `ChatCompletionContext`, and `ChatCompletionClient` to be configured and managed consistently.
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

# Chapter 8: Component - The Standardized Building Blocks
Welcome to Chapter 8! In our journey so far, we've met several key players in AutoGen Core:
* [Agents](01_agent.md): The workers.
* [Messaging System](02_messaging_system__topic___subscription_.md): How they communicate.
* [AgentRuntime](03_agentruntime.md): The manager.
* [Tools](04_tool.md): Their special skills.
* [ChatCompletionClient](05_chatcompletionclient.md): How they talk to LLMs.
* [ChatCompletionContext](06_chatcompletioncontext.md): How they remember recent chat history.
* [Memory](07_memory.md): How they remember things long-term.
Now, imagine you've built a fantastic agent system using these parts. You've configured a specific `ChatCompletionClient` to use OpenAI's `gpt-4o` model, and you've set up a `ListMemory` (from Chapter 7) to store user preferences. How do you save this exact setup so you can easily recreate it later, or share it with a friend? And what if you later want to swap out the `gpt-4o` client for a different one, like Anthropic's Claude, without rewriting your agent's core logic?
This is where the **`Component`** concept comes in. It provides a standard way to define, configure, save, and load these reusable building blocks.
## Motivation: Making Setups Portable and Swappable
Think of the parts we've used so far, such as `ChatCompletionClient`, `Memory`, and `Tool`, as specialized **Lego bricks**. Each brick has a specific function (connecting to an LLM, remembering things, performing an action).
Wouldn't it be great if:
1. Each Lego brick had a standard way to describe its properties (like "Red 2x4 Brick")?
2. You could easily save the description of all the bricks used in your creation (your agent system)?
3. Someone else could take that description and automatically rebuild your exact creation?
4. You could easily swap a "Red 2x4 Brick" for a "Blue 2x4 Brick" without having to rebuild everything around it?
The `Component` abstraction in AutoGen Core provides exactly this! It makes your building blocks **configurable**, **savable**, **loadable**, and **swappable**.
## Key Concepts: Understanding Components
Let's break down what makes the Component system work:
1. **Component:** A class (like `ListMemory` or `OpenAIChatCompletionClient`) that is designed to be a standard, reusable building block. It performs a specific role within the AutoGen ecosystem. Many core classes inherit from `Component` or related base classes.
2. **Configuration (`Config`):** Every Component has specific settings. For example, an `OpenAIChatCompletionClient` needs an API key and a model name. A `ListMemory` might have a name. These settings are defined in a standard way, usually using a Pydantic `BaseModel` specific to that component type. This `Config` acts like the "specification sheet" for the component instance.
3. **Saving Settings (`_to_config` method):** A Component instance knows how to generate its *current* configuration. It has an internal method, `_to_config()`, that returns a `Config` object representing its settings. This is like asking a configured Lego brick, "What color and size are you?"
4. **Loading Settings (`_from_config` class method):** A Component *class* knows how to create a *new* instance of itself from a given configuration. It has a class method, `_from_config(config)`, that takes a `Config` object and builds a new, configured component instance. This is like having instructions: "Build a brick with this color and size."
5. **`ComponentModel` (The Box):** This is the standard package format used to save and load components. It's like the label and instructions on the Lego box. A `ComponentModel` contains:
* `provider`: A string telling AutoGen *which* Python class to use (e.g., `"autogen_core.memory.ListMemory"`).
* `config`: A dictionary holding the specific settings for this instance (the output of `_to_config()`).
* `component_type`: The general role of the component (e.g., `"memory"`, `"model"`, `"tool"`).
* Other metadata like `version`, `description`, `label`.
```python
# From: _component_config.py (Conceptual Structure)
from pydantic import BaseModel
from typing import Dict, Any
class ComponentModel(BaseModel):
provider: str # Path to the class (e.g., "autogen_core.memory.ListMemory")
config: Dict[str, Any] # The specific settings for this instance
component_type: str | None = None # Role (e.g., "memory")
# ... other fields like version, description, label ...
```
This `ComponentModel` is what you typically save to a file (often as JSON or YAML).
## Use Case Example: Saving and Loading `ListMemory`
Let's see how this works with the `ListMemory` we used in [Chapter 7: Memory](07_memory.md).
**Goal:**
1. Create a `ListMemory` instance.
2. Save its configuration using the Component system (`dump_component`).
3. Load that configuration to create a *new*, identical `ListMemory` instance (`load_component`).
**Step 1: Create and Configure a `ListMemory`**
First, let's make a memory component. `ListMemory` is already designed as a Component.
```python
# File: create_memory_component.py
import asyncio
from autogen_core.memory import ListMemory, MemoryContent
# Create an instance of ListMemory
my_memory = ListMemory(name="user_prefs_v1")
# Add some content (from Chapter 7 example)
async def add_content():
pref = MemoryContent(content="Use formal style", mime_type="text/plain")
await my_memory.add(pref)
print(f"Created memory '{my_memory.name}' with content: {my_memory.content}")
asyncio.run(add_content())
# Output: Created memory 'user_prefs_v1' with content: [MemoryContent(content='Use formal style', mime_type='text/plain', metadata=None)]
```
We have our configured `my_memory` instance.
**Step 2: Save the Configuration (`dump_component`)**
Now, let's ask this component instance to describe itself by creating a `ComponentModel`.
```python
# File: save_memory_config.py
# Assume 'my_memory' exists from the previous step
# Dump the component's configuration into a ComponentModel
memory_model = my_memory.dump_component()
# Let's print it (converting to dict for readability)
print("Saved ComponentModel:")
print(memory_model.model_dump_json(indent=2))
```
**Expected Output:**
```json
Saved ComponentModel:
{
"provider": "autogen_core.memory.ListMemory",
"component_type": "memory",
"version": 1,
"component_version": 1,
"description": "ListMemory stores memory content in a simple list.",
"label": "ListMemory",
"config": {
"name": "user_prefs_v1",
"memory_contents": [
{
"content": "Use formal style",
"mime_type": "text/plain",
"metadata": null
}
]
}
}
```
Look at the output! `dump_component` created a `ComponentModel` that contains:
* `provider`: Exactly which class to use (`autogen_core.memory.ListMemory`).
* `config`: The specific settings, including the `name` and even the `memory_contents` we added!
* `component_type`: Its role is `"memory"`.
* Other useful info like description and version.
You could save this JSON structure to a file (`my_memory_config.json`).
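For example (the filename is just an illustration), you might persist it and read it back like this:

```python
import json

from autogen_core import ComponentModel

# Assume 'memory_model' is the ComponentModel from the previous step.
# Save the configuration to disk
with open("my_memory_config.json", "w") as f:
    f.write(memory_model.model_dump_json(indent=2))

# ...later, or in a different script...
with open("my_memory_config.json") as f:
    loaded_model = ComponentModel(**json.load(f))
```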
**Step 3: Load the Configuration (`load_component`)**
Now, imagine you're starting a new script or sharing the config file. You can load this `ComponentModel` to recreate the memory instance.
```python
# File: load_memory_config.py
from autogen_core import ComponentModel
from autogen_core.memory import ListMemory # Need the class for type hint/loading
# Assume 'memory_model' is the ComponentModel we just created
# (or loaded from a file)
print(f"Loading component from ComponentModel (Provider: {memory_model.provider})...")
# Use the ComponentLoader mechanism (available on Component classes)
# to load the model. We specify the expected type (ListMemory).
loaded_memory: ListMemory = ListMemory.load_component(memory_model)
print(f"Successfully loaded memory!")
print(f"- Name: {loaded_memory.name}")
print(f"- Content: {loaded_memory.content}")
```
**Expected Output:**
```
Loading component from ComponentModel (Provider: autogen_core.memory.ListMemory)...
Successfully loaded memory!
- Name: user_prefs_v1
- Content: [MemoryContent(content='Use formal style', mime_type='text/plain', metadata=None)]
```
Success! `load_component` read the `ComponentModel`, found the right class (`ListMemory`), used its `_from_config` method with the saved `config` data, and created a brand new `loaded_memory` instance that is identical to our original `my_memory`.
**Benefits Shown:**
* **Reproducibility:** We saved the exact state (including content!) and loaded it perfectly.
* **Configuration:** We could easily save this to a JSON/YAML file and manage it outside our Python code.
* **Modularity (Conceptual):** If `ListMemory` and `VectorDBMemory` were both Components of type "memory", we could potentially load either one from a configuration file just by changing the `provider` and `config` in the file, without altering the agent code that *uses* the memory component (assuming the agent interacts via the standard `Memory` interface from Chapter 7).
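As a hedged sketch of that idea (assuming the generic `Memory` base class supports `load_component`, and noting that `VectorDBMemory` is purely hypothetical), the code that builds the memory could depend only on the configuration:

```python
from autogen_core.memory import Memory

def build_memory_from_config(model_dict: dict) -> Memory:
    # The provider string inside the dict decides which concrete class is built;
    # the caller only ever sees the generic Memory interface.
    return Memory.load_component(model_dict)

# e.g. rebuilt = build_memory_from_config(memory_model.model_dump())
```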
## Under the Hood: How Saving and Loading Work
Let's peek behind the curtain.
**Saving (`dump_component`) Flow:**
```mermaid
sequenceDiagram
participant User
participant MyMemory as my_memory (ListMemory instance)
participant ListMemConfig as ListMemoryConfig (Pydantic Model)
participant CompModel as ComponentModel
User->>+MyMemory: dump_component()
MyMemory->>MyMemory: Calls internal self._to_config()
MyMemory->>+ListMemConfig: Creates Config object (name="...", contents=[...])
ListMemConfig-->>-MyMemory: Returns Config object
MyMemory->>MyMemory: Gets provider string ("autogen_core.memory.ListMemory")
MyMemory->>MyMemory: Gets component_type ("memory"), version, etc.
MyMemory->>+CompModel: Creates ComponentModel(provider=..., config=config_dict, ...)
CompModel-->>-MyMemory: Returns ComponentModel instance
MyMemory-->>-User: Returns ComponentModel instance
```
1. You call `my_memory.dump_component()`.
2. It calls its own `_to_config()` method. For `ListMemory`, this gathers the `name` and current `_contents`.
3. `_to_config()` returns a `ListMemoryConfig` object (a Pydantic model) holding these values.
4. `dump_component()` takes this `ListMemoryConfig` object, converts its data into a dictionary (`config` field).
5. It figures out its own class path (`provider`) and other metadata (`component_type`, `version`, etc.).
6. It packages all this into a `ComponentModel` object and returns it.
**Loading (`load_component`) Flow:**
```mermaid
sequenceDiagram
participant User
participant Loader as ComponentLoader (e.g., ListMemory.load_component)
participant Importer as Python Import System
participant ListMemClass as ListMemory (Class definition)
participant ListMemConfig as ListMemoryConfig (Pydantic Model)
participant NewMemory as New ListMemory Instance
User->>+Loader: load_component(component_model)
Loader->>Loader: Reads provider ("autogen_core.memory.ListMemory") from model
Loader->>+Importer: Imports the class `autogen_core.memory.ListMemory`
Importer-->>-Loader: Returns ListMemory class object
Loader->>+ListMemClass: Checks if it's a valid Component class
Loader->>ListMemClass: Gets expected config schema (ListMemoryConfig)
Loader->>+ListMemConfig: Validates `config` dict from model against schema
ListMemConfig-->>-Loader: Returns validated ListMemoryConfig object
Loader->>+ListMemClass: Calls _from_config(validated_config)
ListMemClass->>+NewMemory: Creates new ListMemory instance using config
NewMemory-->>-ListMemClass: Returns new instance
ListMemClass-->>-Loader: Returns new instance
Loader-->>-User: Returns the new ListMemory instance
```
1. You call `ListMemory.load_component(memory_model)`.
2. The loader reads the `provider` string from `memory_model`.
3. It dynamically imports the class specified by `provider`.
4. It verifies this class is a proper `Component` subclass.
5. It finds the configuration schema defined by the class (e.g., `ListMemoryConfig`).
6. It validates the `config` dictionary from `memory_model` using this schema.
7. It calls the class's `_from_config()` method, passing the validated configuration object.
8. `_from_config()` uses the configuration data to initialize and return a new instance of the class (e.g., a new `ListMemory` with the loaded name and content).
9. The loader returns this newly created instance.
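Continuing that sketch, loading the component back could look like this. `load_component` accepts either a `ComponentModel` or a plain dict (see the conceptual implementation below), so the parsed JSON can be passed in directly:

```python
import json

from autogen_core.memory import ListMemory

# Read the saved ComponentModel data back from disk...
with open("memory_component.json") as f:
    saved_data = json.load(f)

# ...and let load_component rebuild an equivalent ListMemory instance.
loaded_memory = ListMemory.load_component(saved_data)
```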
**Code Glimpse:**
The core logic lives in `_component_config.py`.
* **`Component` Base Class:** Classes like `ListMemory` inherit from `Component`. This requires them to define `component_type`, `component_config_schema`, and implement `_to_config()` and `_from_config()`.
```python
# From: _component_config.py (Simplified Concept)
from typing import Any, ClassVar, Dict, Generic, Type, TypeVar

from pydantic import BaseModel
from typing_extensions import Self
# ... other imports (ComponentModel is defined in this same module)

ConfigT = TypeVar("ConfigT", bound=BaseModel)

class Component(Generic[ConfigT]):  # Generic over its config type
    # Required Class Variables for Concrete Components
    component_type: ClassVar[str]
    component_config_schema: Type[ConfigT]

    # Required Instance Method for Saving
    def _to_config(self) -> ConfigT:
        raise NotImplementedError

    # Required Class Method for Loading
    @classmethod
    def _from_config(cls, config: ConfigT) -> Self:
        raise NotImplementedError

    # dump_component and load_component are also part of the system
    # (often inherited from base classes like ComponentBase)
    def dump_component(self) -> ComponentModel: ...

    @classmethod
    def load_component(cls, model: ComponentModel | Dict[str, Any]) -> Self: ...
```
* **`ComponentModel`:** As shown before, a Pydantic model to hold the `provider`, `config`, `type`, etc.
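A rough sketch of its shape is below; the field names follow the metadata used in the snippets that surround it, though the real class may declare them slightly differently:

```python
# Conceptual sketch of ComponentModel (not the exact library definition)
from typing import Any, Dict, Optional

from pydantic import BaseModel

class ComponentModel(BaseModel):
    provider: str                          # import path of the component class
    component_type: Optional[str] = None   # e.g. "memory", "model", "tool"
    version: Optional[int] = None          # version of the component spec
    description: Optional[str] = None      # human-readable description
    label: Optional[str] = None            # short display name
    config: Dict[str, Any] = {}            # the component's own settings
```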
* **`dump_component` Implementation (Conceptual):**
```python
# Inside ComponentBase or similar
def dump_component(self) -> ComponentModel:
# 1. Get the specific config from the instance
obj_config: BaseModel = self._to_config()
config_dict = obj_config.model_dump() # Convert to dictionary
# 2. Determine the provider string (class path)
provider_str = _type_to_provider_str(self.__class__)
# (Handle overrides like self.component_provider_override)
# 3. Get other metadata
comp_type = self.component_type
comp_version = self.component_version
# ... description, label ...
# 4. Create and return the ComponentModel
model = ComponentModel(
provider=provider_str,
config=config_dict,
component_type=comp_type,
version=comp_version,
# ... other metadata ...
)
return model
```
* **`load_component` Implementation (Conceptual):**
```python
# Inside ComponentLoader or similar
import importlib

@classmethod
def load_component(cls, model: ComponentModel | Dict[str, Any]) -> Self:
    # 1. Ensure we have a ComponentModel object
    if isinstance(model, dict):
        loaded_model = ComponentModel(**model)
    else:
        loaded_model = model

    # 2. Import the class based on the provider string
    provider_str = loaded_model.provider
    # ... (handle WELL_KNOWN_PROVIDERS mapping) ...
    module_path, class_name = provider_str.rsplit(".", 1)
    module = importlib.import_module(module_path)
    component_class = getattr(module, class_name)

    # 3. Validate the class and config
    if not is_component_class(component_class):  # helper: check it's a valid Component
        raise TypeError(...)
    schema = component_class.component_config_schema
    validated_config = schema.model_validate(loaded_model.config)

    # 4. Call the class's factory method to create an instance
    instance = component_class._from_config(validated_config)

    # 5. Return the instance (after type checks)
    return instance
```
This system provides a powerful and consistent way to manage the building blocks of your AutoGen applications.
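To see the contract end to end, here is a minimal, hypothetical custom component. The `Greeter` class and its config are invented for this sketch, and it assumes `Component` and `ComponentBase` can be imported from the top-level `autogen_core` package:

```python
from pydantic import BaseModel
from typing_extensions import Self

from autogen_core import Component, ComponentBase  # assumed import location


class GreeterConfig(BaseModel):
    """Everything needed to rebuild a Greeter instance."""
    greeting: str


class Greeter(ComponentBase[GreeterConfig], Component[GreeterConfig]):
    component_type = "greeter"               # hypothetical component type
    component_config_schema = GreeterConfig  # schema used to validate saved config

    def __init__(self, greeting: str) -> None:
        self._greeting = greeting

    def greet(self, name: str) -> str:
        return f"{self._greeting}, {name}!"

    def _to_config(self) -> GreeterConfig:
        # Describe this instance's settings so dump_component can save them.
        return GreeterConfig(greeting=self._greeting)

    @classmethod
    def _from_config(cls, config: GreeterConfig) -> Self:
        # Rebuild an equivalent instance from saved settings.
        return cls(greeting=config.greeting)
```

With that in place, `Greeter("Hello").dump_component()` and `Greeter.load_component(...)` would round-trip the greeting just like the `ListMemory` example, under the same assumptions.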
## Wrapping Up
Congratulations! You've reached the end of our core concepts tour. You now understand the `Component` model: AutoGen Core's standard way to define configurable, savable, and loadable building blocks like `Memory`, `ChatCompletionClient`, `Tool`, and even aspects of `Agents` themselves.
* **Components** are like standardized Lego bricks.
* They use **`_to_config`** to describe their settings.
* They use **`_from_config`** to be built from settings.
* **`ComponentModel`** is the standard "box" storing the provider and config, enabling saving/loading (often via JSON/YAML).
This promotes:
* **Modularity:** Easily swap implementations (e.g., different LLM clients).
* **Reproducibility:** Save and load exact agent system configurations.
* **Configuration:** Manage settings in external files.
With these eight core concepts (`Agent`, `Messaging`, `AgentRuntime`, `Tool`, `ChatCompletionClient`, `ChatCompletionContext`, `Memory`, and `Component`), you have a solid foundation for understanding and building powerful multi-agent applications with AutoGen Core!
Happy building!
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)


@@ -0,0 +1,47 @@
# Tutorial: AutoGen Core
AutoGen Core helps you build applications with multiple **_Agents_** that can work together.
Think of it like creating a team of specialized workers (*Agents*) who can communicate and use tools to solve problems.
The **_AgentRuntime_** acts as the manager, handling messages and agent lifecycles.
Agents communicate using a **_Messaging System_** (Topics and Subscriptions), can use **_Tools_** for specific tasks, interact with language models via a **_ChatCompletionClient_** while managing conversation history with **_ChatCompletionContext_**, and remember information using **_Memory_**.
**_Components_** provide a standard way to define and configure these building blocks.
**Source Repository:** [https://github.com/microsoft/autogen/tree/e45a15766746d95f8cfaaa705b0371267bec812e/python/packages/autogen-core/src/autogen_core](https://github.com/microsoft/autogen/tree/e45a15766746d95f8cfaaa705b0371267bec812e/python/packages/autogen-core/src/autogen_core)
```mermaid
flowchart TD
A0["0: Agent"]
A1["1: AgentRuntime"]
A2["2: Messaging System (Topic & Subscription)"]
A3["3: Component"]
A4["4: Tool"]
A5["5: ChatCompletionClient"]
A6["6: ChatCompletionContext"]
A7["7: Memory"]
A1 -- "Manages lifecycle" --> A0
A1 -- "Uses for message routing" --> A2
A0 -- "Uses LLM client" --> A5
A0 -- "Executes tools" --> A4
A0 -- "Accesses memory" --> A7
A5 -- "Gets history from" --> A6
A5 -- "Uses tool schema" --> A4
A7 -- "Updates LLM context" --> A6
A4 -- "Implemented as" --> A3
```
## Chapters
1. [Agent](01_agent.md)
2. [Messaging System (Topic & Subscription)](02_messaging_system__topic___subscription_.md)
3. [AgentRuntime](03_agentruntime.md)
4. [Tool](04_tool.md)
5. [ChatCompletionClient](05_chatcompletionclient.md)
6. [ChatCompletionContext](06_chatcompletioncontext.md)
7. [Memory](07_memory.md)
8. [Component](08_component.md)
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)