diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..db84ce2
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 Zachary Huang
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index 02af832..ff472c8 100644
--- a/README.md
+++ b/README.md
@@ -1,20 +1,23 @@
-
+Ever stared at a new codebase written by others and felt completely lost? This tutorial shows you how to build an AI agent that analyzes GitHub repositories and creates beginner-friendly tutorials explaining exactly how the code works.
+
-
-This is a project template for Agentic Coding with [Pocket Flow](https://github.com/The-Pocket/PocketFlow), a 100-line LLM framework, and Cursor.
+This project crawls GitHub repositories and builds a knowledge base from the code:
-- We have included the [.cursorrules](.cursorrules) file to let Cursor AI help you build LLM projects.
-
-- Want to learn how to build LLM projects with Agentic Coding?
-
- - Check out the [Agentic Coding Guidance](https://the-pocket.github.io/PocketFlow/guide.html)
-
- - Check out the [YouTube Tutorial](https://www.youtube.com/@ZacharyLLM?sub_confirmation=1)
+- **Analyze entire codebases** to identify core abstractions and how they interact
+- **Transform complex code** into beginner-friendly tutorials with clear visualizations
+- **Build understanding systematically** from fundamentals to advanced concepts in logical steps
+
+Built with [Pocket Flow](https://github.com/The-Pocket/PocketFlow), a 100-line LLM framework.
\ No newline at end of file
diff --git a/assets/banner.png b/assets/banner.png
index 09f0f04..3baec68 100644
Binary files a/assets/banner.png and b/assets/banner.png differ
diff --git a/output/AutoGen Core/01_agent.md b/output/AutoGen Core/01_agent.md
new file mode 100644
index 0000000..b4c730f
--- /dev/null
+++ b/output/AutoGen Core/01_agent.md
@@ -0,0 +1,281 @@
+# Chapter 1: Agent - The Workers of AutoGen
+
+Welcome to the AutoGen Core tutorial! We're excited to guide you through building powerful applications with autonomous agents.
+
+## Motivation: Why Do We Need Agents?
+
+Imagine you want to build an automated system to write blog posts. You might need one part of the system to research a topic and another part to write the actual post based on the research. How do you represent these different "workers" and make them talk to each other?
+
+This is where the concept of an **Agent** comes in. In AutoGen Core, an `Agent` is the fundamental building block representing an actor or worker in your system. Think of it like an employee in an office.
+
+## Key Concepts: Understanding Agents
+
+Let's break down what makes an Agent:
+
+1. **It's a Worker:** An Agent is designed to *do* things. This could be running calculations, calling a Large Language Model (LLM) like ChatGPT, using a tool (like a search engine), or managing a piece of data.
+2. **It Has an Identity (`AgentId`):** Just like every employee has a name and a job title, every Agent needs a unique identity. This identity, called `AgentId`, has two parts:
+ * `type`: What kind of role does the agent have? (e.g., "researcher", "writer", "coder"). This helps organize agents.
+ * `key`: A unique name for this specific agent instance (e.g., "researcher-01", "amy-the-writer").
+
+ ```python
+ # From: _agent_id.py
+ class AgentId:
+ def __init__(self, type: str, key: str) -> None:
+ # ... (validation checks omitted for brevity)
+ self._type = type
+ self._key = key
+
+ @property
+ def type(self) -> str:
+ return self._type
+
+ @property
+ def key(self) -> str:
+ return self._key
+
+ def __str__(self) -> str:
+ # Creates an id like "researcher/amy-the-writer"
+ return f"{self._type}/{self._key}"
+ ```
+ This `AgentId` acts like the agent's address, allowing other agents (or the system) to send messages specifically to it.
+
+3. **It Has Metadata (`AgentMetadata`):** Besides its core identity, an agent often has descriptive information.
+ * `type`: Same as in `AgentId`.
+ * `key`: Same as in `AgentId`.
+ * `description`: A human-readable explanation of what the agent does (e.g., "Researches topics using web search").
+
+ ```python
+ # From: _agent_metadata.py
+ from typing import TypedDict
+
+ class AgentMetadata(TypedDict):
+ type: str
+ key: str
+ description: str
+ ```
+ This metadata helps understand the agent's purpose within the system.
+
+4. **It Communicates via Messages:** Agents don't work in isolation. They collaborate by sending and receiving messages. The primary way an agent receives work is through its `on_message` method. Think of this like the agent's inbox.
+
+ ```python
+ # From: _agent.py (Simplified Agent Protocol)
+ from typing import Any, Mapping, Protocol
+ # ... other imports
+
+ class Agent(Protocol):
+ @property
+ def id(self) -> AgentId: ... # The agent's unique ID
+
+ async def on_message(self, message: Any, ctx: MessageContext) -> Any:
+ """Handles an incoming message."""
+ # Agent's logic to process the message goes here
+ ...
+ ```
+ When an agent receives a message, `on_message` is called. The `message` contains the data or task, and `ctx` (MessageContext) provides extra information about the message (like who sent it). We'll cover `MessageContext` more later.
+
+5. **It Can Remember Things (State):** Sometimes, an agent needs to remember information between tasks, like keeping notes on research progress. Agents can optionally implement `save_state` and `load_state` methods to store and retrieve their internal memory.
+
+ ```python
+ # From: _agent.py (Simplified Agent Protocol)
+ class Agent(Protocol):
+ # ... other methods
+
+ async def save_state(self) -> Mapping[str, Any]:
+ """Save the agent's internal memory."""
+ # Return a dictionary representing the state
+ ...
+
+ async def load_state(self, state: Mapping[str, Any]) -> None:
+ """Load the agent's internal memory."""
+ # Restore state from the dictionary
+ ...
+ ```
+ We'll explore state and memory in more detail in [Chapter 7: Memory](07_memory.md).
+
+6. **Different Agent Types:** AutoGen Core provides base classes to make creating agents easier:
+ * `BaseAgent`: The fundamental class most agents inherit from. It provides common setup.
+ * `ClosureAgent`: A very quick way to create simple agents using just a function (like hiring a temp worker for a specific task defined on the spot).
+   * `RoutedAgent`: An agent that can automatically direct different types of messages to different internal handler methods (like a smart receptionist). A small sketch follows this list.
+
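+As a quick taste of the last one, here is a minimal sketch of a `RoutedAgent` (our own illustration, assuming the `RoutedAgent` base class and `message_handler` decorator exported by `autogen_core`). Each handler is picked based on the type of the incoming message:
+
+```python
+from dataclasses import dataclass
+from autogen_core import MessageContext, RoutedAgent, message_handler
+
+@dataclass
+class Greeting:
+    text: str
+
+@dataclass
+class Farewell:
+    text: str
+
+class ReceptionistAgent(RoutedAgent):
+    def __init__(self) -> None:
+        super().__init__(description="Routes messages by their type.")
+
+    @message_handler  # invoked only for Greeting messages
+    async def on_greeting(self, message: Greeting, ctx: MessageContext) -> None:
+        print(f"Greeting received: {message.text}")
+
+    @message_handler  # invoked only for Farewell messages
+    async def on_farewell(self, message: Farewell, ctx: MessageContext) -> None:
+        print(f"Farewell received: {message.text}")
+```
+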
+## Use Case Example: Researcher and Writer
+
+Let's revisit our blog post example. We want a `Researcher` agent and a `Writer` agent.
+
+**Goal:**
+1. Tell the `Researcher` a topic (e.g., "AutoGen Agents").
+2. The `Researcher` finds some facts (we'll keep it simple and just make them up for now).
+3. The `Researcher` sends these facts to the `Writer`.
+4. The `Writer` receives the facts and drafts a short post.
+
+**Simplified Implementation Idea (using `ClosureAgent` for brevity):**
+
+First, let's define the messages they might exchange:
+
+```python
+from dataclasses import dataclass
+
+@dataclass
+class ResearchTopic:
+ topic: str
+
+@dataclass
+class ResearchFacts:
+ topic: str
+ facts: list[str]
+
+@dataclass
+class DraftPost:
+ topic: str
+ draft: str
+```
+These are simple Python classes to hold the data being passed around.
+
+Now, let's imagine defining the `Researcher` using a `ClosureAgent`. This agent will listen for `ResearchTopic` messages.
+
+```python
+# Simplified concept - requires AgentRuntime (Chapter 3) to actually run
+from autogen_core import AgentId
+
+async def researcher_logic(agent_context, message: ResearchTopic, msg_context):
+ print(f"Researcher received topic: {message.topic}")
+ # In a real scenario, this would involve searching, calling an LLM, etc.
+ # For now, we just make up facts.
+ facts = [f"Fact 1 about {message.topic}", f"Fact 2 about {message.topic}"]
+ print(f"Researcher found facts: {facts}")
+
+ # Find the Writer agent's ID (we assume we know it)
+ writer_id = AgentId(type="writer", key="blog_writer_1")
+
+ # Send the facts to the Writer
+ await agent_context.send_message(
+ message=ResearchFacts(topic=message.topic, facts=facts),
+ recipient=writer_id,
+ )
+ print("Researcher sent facts to Writer.")
+ # This agent doesn't return a direct reply
+ return None
+```
+This `researcher_logic` function defines *what* the researcher does when it gets a `ResearchTopic` message. It processes the topic, creates `ResearchFacts`, and uses `agent_context.send_message` to send them to the `writer` agent.
+
+Similarly, the `Writer` agent would have its own logic:
+
+```python
+# Simplified concept - requires AgentRuntime (Chapter 3) to actually run
+
+async def writer_logic(agent_context, message: ResearchFacts, msg_context):
+ print(f"Writer received facts for topic: {message.topic}")
+ # In a real scenario, this would involve LLM prompting
+ draft = f"Blog Post about {message.topic}:\n"
+ for fact in message.facts:
+ draft += f"- {fact}\n"
+ print(f"Writer drafted post:\n{draft}")
+
+ # Perhaps save the draft or send it somewhere else
+ # For now, we just print it. We don't send another message.
+ return None # Or maybe return a confirmation/result
+```
+This `writer_logic` function defines how the writer reacts to receiving `ResearchFacts`.
+
+**Important:** To actually *run* these agents and make them communicate, we need the `AgentRuntime` (covered in [Chapter 3: AgentRuntime](03_agentruntime.md)) and the `Messaging System` (covered in [Chapter 2: Messaging System](02_messaging_system__topic___subscription_.md)). For now, focus on the *idea* that Agents are distinct workers defined by their logic (`on_message`) and identified by their `AgentId`.
+
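+For orientation, wiring those closures into a runtime might look roughly like the sketch below. Treat it as a hedged sketch: we assume a `ClosureAgent.register_closure` helper taking the runtime, an agent type, and the closure function, and the exact keyword arguments may differ.
+
+```python
+# Hedged sketch - the exact registration API may differ
+from autogen_core import ClosureAgent, SingleThreadedAgentRuntime
+
+async def setup_agents():
+    runtime = SingleThreadedAgentRuntime()
+    # Register each closure under an agent type the runtime can address
+    await ClosureAgent.register_closure(runtime, "researcher", researcher_logic)
+    await ClosureAgent.register_closure(runtime, "writer", writer_logic)
+    return runtime
+```
+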
+## Under the Hood: How an Agent Gets a Message
+
+While the full message delivery involves the `Messaging System` and `AgentRuntime`, let's look at the agent's role when it receives a message.
+
+**Conceptual Flow:**
+
+```mermaid
+sequenceDiagram
+ participant Sender as Sender Agent
+ participant Runtime as AgentRuntime
+ participant Recipient as Recipient Agent
+
+    Sender->>Runtime: send_message(message, recipient_id)
+    Runtime->>Runtime: Locate agent by recipient_id
+    Runtime->>Recipient: on_message(message, context)
+    Recipient->>Recipient: Process message using internal logic
+    alt Response Needed
+        Recipient-->>Runtime: Return response value
+        Runtime-->>Sender: Deliver response value
+    else No Response
+        Recipient-->>Runtime: Return None (or no return)
+    end
+```
+
+1. Some other agent (Sender) or the system decides to send a message to our agent (Recipient).
+2. It tells the `AgentRuntime` (the manager): "Deliver this `message` to the agent with `recipient_id`".
+3. The `AgentRuntime` finds the correct `Recipient` agent instance.
+4. The `AgentRuntime` calls the `Recipient.on_message(message, context)` method.
+5. The agent's internal logic inside `on_message` (or methods called by it, like in `RoutedAgent`) runs to process the message.
+6. If the message requires a direct response (like an RPC call), the agent returns a value from `on_message`. If not (like a general notification or event), it might return `None`.
+
+**Code Glimpse:**
+
+The core definition is the `Agent` Protocol (`_agent.py`). It's like an interface or a contract: any class wanting to be an Agent *must* provide these methods.
+
+```python
+# From: _agent.py - The Agent blueprint (Protocol)
+
+@runtime_checkable
+class Agent(Protocol):
+ @property
+ def metadata(self) -> AgentMetadata: ...
+
+ @property
+ def id(self) -> AgentId: ...
+
+ async def on_message(self, message: Any, ctx: MessageContext) -> Any: ...
+
+ async def save_state(self) -> Mapping[str, Any]: ...
+
+ async def load_state(self, state: Mapping[str, Any]) -> None: ...
+
+ async def close(self) -> None: ...
+```
+
+Most agents you create will inherit from `BaseAgent` (`_base_agent.py`). It provides some standard setup:
+
+```python
+# From: _base_agent.py (Simplified)
+class BaseAgent(ABC, Agent):
+ def __init__(self, description: str) -> None:
+ # Gets runtime & id from a special context when created by the runtime
+ # Raises error if you try to create it directly!
+ self._runtime: AgentRuntime = AgentInstantiationContext.current_runtime()
+ self._id: AgentId = AgentInstantiationContext.current_agent_id()
+ self._description = description
+ # ...
+
+ # This is the final version called by the runtime
+ @final
+ async def on_message(self, message: Any, ctx: MessageContext) -> Any:
+ # It calls the implementation method you need to write
+ return await self.on_message_impl(message, ctx)
+
+ # You MUST implement this in your subclass
+ @abstractmethod
+ async def on_message_impl(self, message: Any, ctx: MessageContext) -> Any: ...
+
+ # Helper to send messages easily
+ async def send_message(self, message: Any, recipient: AgentId, ...) -> Any:
+ # It just asks the runtime to do the actual sending
+ return await self._runtime.send_message(
+ message, sender=self.id, recipient=recipient, ...
+ )
+ # ... other methods like publish_message, save_state, load_state
+```
+Notice how `BaseAgent` handles getting its `id` and `runtime` during creation and provides a convenient `send_message` method that uses the runtime. When inheriting from `BaseAgent`, you primarily focus on implementing the `on_message_impl` method to define your agent's unique behavior.
+
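+Putting that together, a minimal `BaseAgent` subclass might look like this (our own sketch, reusing the names imported in the snippets above):
+
+```python
+class EchoAgent(BaseAgent):
+    def __init__(self) -> None:
+        # BaseAgent.__init__ pulls runtime and id from AgentInstantiationContext
+        super().__init__(description="Echoes every message back.")
+
+    async def on_message_impl(self, message: Any, ctx: MessageContext) -> Any:
+        # The returned value becomes the reply to a direct send_message call
+        return message
+```
+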
+## Next Steps
+
+You now understand the core concept of an `Agent` in AutoGen Core! It's the fundamental worker unit with an identity, the ability to process messages, and optionally maintain state.
+
+In the next chapters, we'll explore:
+
+* [Chapter 2: Messaging System](02_messaging_system__topic___subscription_.md): How messages actually travel between agents.
+* [Chapter 3: AgentRuntime](03_agentruntime.md): The manager responsible for creating, running, and connecting agents.
+
+Let's continue building your understanding!
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
diff --git a/output/AutoGen Core/02_messaging_system__topic___subscription_.md b/output/AutoGen Core/02_messaging_system__topic___subscription_.md
new file mode 100644
index 0000000..24e8e5d
--- /dev/null
+++ b/output/AutoGen Core/02_messaging_system__topic___subscription_.md
@@ -0,0 +1,267 @@
+# Chapter 2: Messaging System (Topic & Subscription)
+
+In [Chapter 1: Agent](01_agent.md), we learned about Agents as individual workers. But how do they coordinate when one agent doesn't know exactly *who* needs the information it produces? Imagine our Researcher finds some facts. Maybe the Writer needs them, but maybe a Fact-Checker agent or a Summary agent also needs them later. How can the Researcher just announce "Here are the facts!" without needing a specific mailing list?
+
+This is where the **Messaging System**, specifically **Topics** and **Subscriptions**, comes in. It allows agents to broadcast messages to anyone interested, like posting on a company announcement board.
+
+## Motivation: Broadcasting Information
+
+Let's refine our blog post example:
+
+1. The `Researcher` agent finds facts about "AutoGen Agents".
+2. Instead of sending *directly* to the `Writer`, the `Researcher` **publishes** these facts to a general "research-results" **Topic**.
+3. The `Writer` agent has previously told the system it's **subscribed** to the "research-results" Topic.
+4. The system sees the new message on the Topic and delivers it to the `Writer` (and any other subscribers).
+
+This way, the `Researcher` doesn't need to know who the `Writer` is, or even if a `Writer` exists! It just broadcasts the results. If we later add a `FactChecker` agent that also needs the results, it simply subscribes to the same Topic.
+
+## Key Concepts: Topics and Subscriptions
+
+Let's break down the components of this broadcasting system:
+
+1. **Topic (`TopicId`): The Announcement Board**
+ * A `TopicId` represents a specific channel or category for messages. Think of it like the name of an announcement board (e.g., "Project Updates", "General Announcements").
+ * It has two main parts:
+ * `type`: What *kind* of event or information is this? (e.g., "research.completed", "user.request"). This helps categorize messages.
+ * `source`: *Where* or *why* did this event originate? Often, this relates to the specific task or context (e.g., the specific blog post being researched like "autogen-agents-blog-post", or the team generating the event like "research-team").
+
+ ```python
+ # From: _topic.py (Simplified)
+ from dataclasses import dataclass
+
+ @dataclass(frozen=True) # Immutable: can't change after creation
+ class TopicId:
+ type: str
+ source: str
+
+ def __str__(self) -> str:
+ # Creates an id like "research.completed/autogen-agents-blog-post"
+ return f"{self.type}/{self.source}"
+ ```
+ This structure allows for flexible filtering. Agents might subscribe to all topics of a certain `type`, regardless of the `source`, or only to topics with a specific `source`.
+
+2. **Publishing: Posting the Announcement**
+ * When an agent has information to share broadly, it *publishes* a message to a specific `TopicId`.
+ * This is like pinning a note to the designated announcement board. The agent doesn't need to know who will read it.
+
+3. **Subscription (`Subscription`): Signing Up for Updates**
+ * A `Subscription` is how an agent declares its interest in certain `TopicId`s.
+ * It acts like a rule: "If a message is published to a Topic that matches *this pattern*, please deliver it to *this kind of agent*".
+ * The `Subscription` links a `TopicId` pattern (e.g., "all topics with type `research.completed`") to an `AgentId` (or a way to determine the `AgentId`).
+
+4. **Routing: Delivering the Mail**
+ * The `AgentRuntime` (the system manager we'll meet in [Chapter 3: AgentRuntime](03_agentruntime.md)) keeps track of all active `Subscription`s.
+ * When a message is published to a `TopicId`, the `AgentRuntime` checks which `Subscription`s match that `TopicId`.
+ * For each match, it uses the `Subscription`'s rule to figure out which specific `AgentId` should receive the message and delivers it.
+
+## Use Case Example: Researcher Publishes, Writer Subscribes
+
+Let's see how our Researcher and Writer can use this system.
+
+**Goal:** Researcher publishes facts to a topic, Writer receives them via subscription.
+
+**1. Define the Topic:**
+We need a `TopicId` for research results. Let's say the `type` is "research.facts.available" and the `source` identifies the specific research task (e.g., "blog-post-autogen").
+
+```python
+# From: _topic.py
+from autogen_core import TopicId
+
+# Define the topic for this specific research task
+research_topic_id = TopicId(type="research.facts.available", source="blog-post-autogen")
+
+print(f"Topic ID: {research_topic_id}")
+# Output: Topic ID: research.facts.available/blog-post-autogen
+```
+This defines the "announcement board" we'll use.
+
+**2. Researcher Publishes:**
+The `Researcher` agent, after finding facts, will use its `agent_context` (provided by the runtime) to publish the `ResearchFacts` message to this topic.
+
+```python
+# Simplified concept - Researcher agent logic
+# Assume 'agent_context' and 'message' (ResearchTopic) are provided
+from dataclasses import dataclass
+from autogen_core import TopicId
+
+# Define the facts message (from Chapter 1)
+@dataclass
+class ResearchFacts:
+ topic: str
+ facts: list[str]
+
+async def researcher_publish_logic(agent_context, message: ResearchTopic, msg_context):
+ print(f"Researcher working on: {message.topic}")
+ facts_data = ResearchFacts(
+ topic=message.topic,
+ facts=[f"Fact A about {message.topic}", f"Fact B about {message.topic}"]
+ )
+
+ # Define the specific topic for this task's results
+ results_topic = TopicId(type="research.facts.available", source=message.topic) # Use message topic as source
+
+ # Publish the facts to the topic
+ await agent_context.publish_message(message=facts_data, topic_id=results_topic)
+ print(f"Researcher published facts to topic: {results_topic}")
+ # No direct reply needed
+ return None
+```
+Notice the `agent_context.publish_message` call. The Researcher doesn't specify a recipient, only the topic.
+
+**3. Writer Subscribes:**
+The `Writer` agent needs to tell the system it's interested in messages on topics like "research.facts.available". We can use a predefined `Subscription` type called `TypeSubscription`. This subscription typically means: "I am interested in all topics with this *exact type*. When a message arrives, create/use an agent of *my type* whose `key` matches the topic's `source`."
+
+```python
+# From: _type_subscription.py (Simplified Concept)
+from autogen_core import TypeSubscription, BaseAgent
+
+class WriterAgent(BaseAgent):
+ # ... agent implementation ...
+ async def on_message_impl(self, message: ResearchFacts, ctx):
+ # This method gets called when a subscribed message arrives
+ print(f"Writer ({self.id}) received facts via subscription: {message.facts}")
+ # ... process facts and write draft ...
+
+# How the Writer subscribes (usually done during runtime setup - Chapter 3)
+# This tells the runtime: "Messages on topics with type 'research.facts.available'
+# should go to a 'writer' agent whose key matches the topic source."
+writer_subscription = TypeSubscription(
+ topic_type="research.facts.available",
+ agent_type="writer" # The type of agent that should handle this
+)
+
+print(f"Writer subscription created for topic type: {writer_subscription.topic_type}")
+# Output: Writer subscription created for topic type: research.facts.available
+```
+When the `Researcher` publishes to `TopicId(type="research.facts.available", source="blog-post-autogen")`, the `AgentRuntime` will see that `writer_subscription` matches the `topic_type`. It will then use the rule: "Find (or create) an agent with `AgentId(type='writer', key='blog-post-autogen')` and deliver the message."
+
+**Benefit:** Decoupling! The Researcher just broadcasts. The Writer just listens for relevant broadcasts. We can add more listeners (like a `FactChecker` subscribing to the same `topic_type`) without changing the `Researcher` at all.
+
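+For instance, a fact checker could be added later with one extra subscription and no changes to the `Researcher` (illustrative; "fact_checker" is a hypothetical agent type):
+
+```python
+# Another listener on the same announcement board
+fact_checker_sub = TypeSubscription(
+    topic_type="research.facts.available",  # same topic type as the writer
+    agent_type="fact_checker",              # hypothetical new agent type
+)
+# Registered with the runtime in Chapter 3: await runtime.add_subscription(...)
+```
+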
+## Under the Hood: How Publishing Works
+
+Let's trace the journey of a published message.
+
+**Conceptual Flow:**
+
+```mermaid
+sequenceDiagram
+ participant Publisher as Publisher Agent
+ participant Runtime as AgentRuntime
+ participant SubRegistry as Subscription Registry
+ participant Subscriber as Subscriber Agent
+
+ Publisher->>+Runtime: publish_message(message, topic_id)
+ Runtime->>+SubRegistry: Find subscriptions matching topic_id
+ SubRegistry-->>-Runtime: Return list of matching Subscriptions
+ loop For each matching Subscription
+        Runtime->>SubRegistry: map_to_agent(topic_id) on the matching Subscription
+        SubRegistry-->>Runtime: Return target AgentId
+ Runtime->>+Subscriber: Locate/Create Agent instance by AgentId
+ Runtime->>Subscriber: on_message(message, context)
+ Subscriber-->>-Runtime: Process message (optional return)
+ end
+ Runtime-->>-Publisher: Return (usually None for publish)
+```
+
+1. **Publish:** An agent calls `agent_context.publish_message(message, topic_id)`. This internally calls the `AgentRuntime`'s publish method.
+2. **Lookup:** The `AgentRuntime` takes the `topic_id` and consults its internal `Subscription Registry`.
+3. **Match:** The Registry checks all registered `Subscription` objects. Each `Subscription` has an `is_match(topic_id)` method. The registry finds all subscriptions where `is_match` returns `True`.
+4. **Map:** For each matching `Subscription`, the Runtime calls its `map_to_agent(topic_id)` method. This method returns the specific `AgentId` that should handle this message based on the subscription rule and the topic details.
+5. **Deliver:** The `AgentRuntime` finds the agent instance corresponding to the returned `AgentId` (potentially creating it if it doesn't exist yet, especially with `TypeSubscription`). It then calls that agent's `on_message` method, delivering the original published `message`.
+
+**Code Glimpse:**
+
+* **`TopicId` (`_topic.py`):** As shown before, a simple dataclass holding `type` and `source`. It includes validation to ensure the `type` follows certain naming conventions.
+
+ ```python
+ # From: _topic.py
+ @dataclass(eq=True, frozen=True)
+ class TopicId:
+ type: str
+ source: str
+ # ... validation and __str__ ...
+
+ @classmethod
+ def from_str(cls, topic_id: str) -> Self:
+ # Helper to parse "type/source" string
+ # ... implementation ...
+ ```
+
+* **`Subscription` Protocol (`_subscription.py`):** This defines the *contract* for any subscription rule.
+
+ ```python
+ # From: _subscription.py (Simplified Protocol)
+ from typing import Protocol
+ # ... other imports
+
+ class Subscription(Protocol):
+ @property
+ def id(self) -> str: ... # Unique ID for this subscription instance
+
+ def is_match(self, topic_id: TopicId) -> bool:
+ """Check if a topic matches this subscription's rule."""
+ ...
+
+ def map_to_agent(self, topic_id: TopicId) -> AgentId:
+ """Determine the target AgentId if is_match was True."""
+ ...
+ ```
+ Any class implementing these methods can act as a subscription rule.
+
+* **`TypeSubscription` (`_type_subscription.py`):** A common implementation of the `Subscription` protocol.
+
+ ```python
+ # From: _type_subscription.py (Simplified)
+ class TypeSubscription(Subscription):
+ def __init__(self, topic_type: str, agent_type: str, ...):
+ self._topic_type = topic_type
+ self._agent_type = agent_type
+ # ... generates a unique self._id ...
+
+ def is_match(self, topic_id: TopicId) -> bool:
+ # Matches if the topic's type is exactly the one we want
+ return topic_id.type == self._topic_type
+
+ def map_to_agent(self, topic_id: TopicId) -> AgentId:
+ # Maps to an agent of the specified type, using the
+ # topic's source as the agent's unique key.
+ if not self.is_match(topic_id):
+ raise CantHandleException(...) # Should not happen if used correctly
+ return AgentId(type=self._agent_type, key=topic_id.source)
+ # ... id property ...
+ ```
+    This implementation provides the "one agent instance per source" behavior for a specific topic type. A short worked example follows this list.
+
+* **`DefaultSubscription` (`_default_subscription.py`):** This is often used via a decorator (`@default_subscription`) and provides a convenient way to create a `TypeSubscription` where the `agent_type` is automatically inferred from the agent class being defined, and the `topic_type` defaults to "default" (but can be overridden). It simplifies common use cases.
+
+ ```python
+ # From: _default_subscription.py (Conceptual Usage)
+    from autogen_core import BaseAgent, default_subscription
+    # ResearchFacts is our message dataclass from Chapter 1 (not part of autogen_core)
+
+ @default_subscription # Uses 'default' topic type, infers agent type 'writer'
+ class WriterAgent(BaseAgent):
+ # Agent logic here...
+ async def on_message_impl(self, message: ResearchFacts, ctx): ...
+
+ # Or specify the topic type
+ @default_subscription(topic_type="research.facts.available")
+ class SpecificWriterAgent(BaseAgent):
+ # Agent logic here...
+ async def on_message_impl(self, message: ResearchFacts, ctx): ...
+ ```
+
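+Here is the short worked example promised above, showing how a `TypeSubscription` matches a topic and maps it to a target agent:
+
+```python
+from autogen_core import TopicId, TypeSubscription
+
+sub = TypeSubscription(topic_type="research.facts.available", agent_type="writer")
+topic = TopicId(type="research.facts.available", source="blog-post-autogen")
+
+print(sub.is_match(topic))      # True - the topic type matches exactly
+print(sub.map_to_agent(topic))  # writer/blog-post-autogen
+```
+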
+The actual sending (`publish_message`) and routing logic reside within the `AgentRuntime`, which we'll explore next.
+
+## Next Steps
+
+You've learned how AutoGen Core uses a publish/subscribe system (`TopicId`, `Subscription`) to allow agents to communicate without direct coupling. This is crucial for building flexible and scalable multi-agent applications.
+
+* **Topic (`TopicId`):** Named channels (`type`/`source`) for broadcasting messages.
+* **Publish:** Sending a message to a Topic.
+* **Subscription:** An agent's declared interest in messages on certain Topics, defining a routing rule.
+
+Now, let's dive into the orchestrator that manages agents and makes this messaging system work:
+
+* [Chapter 3: AgentRuntime](03_agentruntime.md): The manager responsible for creating, running, and connecting agents, including handling message publishing and subscription routing.
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/AutoGen Core/03_agentruntime.md b/output/AutoGen Core/03_agentruntime.md
new file mode 100644
index 0000000..09aec0e
--- /dev/null
+++ b/output/AutoGen Core/03_agentruntime.md
@@ -0,0 +1,349 @@
+# Chapter 3: AgentRuntime - The Office Manager
+
+In [Chapter 1: Agent](01_agent.md), we met the workers (`Agent`) of our system. In [Chapter 2: Messaging System](02_messaging_system__topic___subscription_.md), we saw how they can communicate broadly using topics and subscriptions. But who hires these agents? Who actually delivers the messages, whether direct or published? And who keeps the whole system running smoothly?
+
+This is where the **`AgentRuntime`** comes in. It's the central nervous system, the operating system, or perhaps the most fitting analogy: **the office manager** for all your agents.
+
+## Motivation: Why Do We Need an Office Manager?
+
+Imagine an office full of employees (Agents). You have researchers, writers, maybe coders.
+* How does a new employee get hired and set up?
+* When one employee wants to send a memo directly to another, who makes sure it gets to the right desk?
+* When someone posts an announcement on the company bulletin board (publishes to a topic), who ensures everyone who signed up for that type of announcement sees it?
+* Who starts the workday and ensures everything keeps running?
+
+Without an office manager, it would be chaos! The `AgentRuntime` serves this crucial role in AutoGen Core. It handles:
+
+1. **Agent Creation:** "Onboarding" new agents when they are needed.
+2. **Message Routing:** Delivering direct messages (`send_message`) and published messages (`publish_message`).
+3. **Lifecycle Management:** Starting, running, and stopping the whole system.
+4. **State Management:** Keeping track of the overall system state (optional).
+
+## Key Concepts: Understanding the Manager's Job
+
+Let's break down the main responsibilities of the `AgentRuntime`:
+
+1. **Agent Instantiation (Hiring):**
+ * You don't usually create agent objects directly (like `my_agent = ResearcherAgent()`). Why? Because the agent needs to know *about* the runtime (the office it works in) to send messages, publish announcements, etc.
+ * Instead, you tell the `AgentRuntime`: "I need an agent of type 'researcher'. Here's a recipe (a **factory function**) for how to create one." This is done using `runtime.register_factory(...)`.
+ * When a message needs to go to a 'researcher' agent with a specific key (e.g., 'researcher-01'), the runtime checks if it already exists. If not, it uses the registered factory function to create (instantiate) the agent.
+ * **Crucially**, while creating the agent, the runtime provides special context (`AgentInstantiationContext`) so the new agent automatically gets its unique `AgentId` and a reference to the `AgentRuntime` itself. This is like giving a new employee their ID badge and telling them who the office manager is.
+
+ ```python
+ # Simplified Concept - How a BaseAgent gets its ID and runtime access
+ # From: _agent_instantiation.py and _base_agent.py
+
+ # Inside the agent's __init__ method (when inheriting from BaseAgent):
+ class MyAgent(BaseAgent):
+ def __init__(self, description: str):
+ # This magic happens *because* the AgentRuntime is creating the agent
+ # inside a special context.
+ self._runtime = AgentInstantiationContext.current_runtime() # Gets the manager
+ self._id = AgentInstantiationContext.current_agent_id() # Gets its own ID
+ self._description = description
+ # ... rest of initialization ...
+ ```
+ This ensures agents are properly integrated into the system from the moment they are created.
+
+2. **Message Delivery (Mail Room):**
+ * **Direct Send (`send_message`):** When an agent calls `await agent_context.send_message(message, recipient_id)`, it's actually telling the `AgentRuntime`, "Please deliver this `message` directly to the agent identified by `recipient_id`." The runtime finds the recipient agent (creating it if necessary) and calls its `on_message` method. It's like putting a specific name on an envelope and handing it to the mail room.
+ * **Publish (`publish_message`):** When an agent calls `await agent_context.publish_message(message, topic_id)`, it tells the runtime, "Post this `message` to the announcement board named `topic_id`." The runtime then checks its list of **subscriptions** (who signed up for which boards). For every matching subscription, it figures out the correct recipient agent(s) (based on the subscription rule) and delivers the message to their `on_message` method.
+
+3. **Lifecycle Management (Opening/Closing the Office):**
+ * The runtime needs to be started to begin processing messages. Typically, you call `runtime.start()`. This usually kicks off a background process or loop that watches for incoming messages.
+   * When work is done, you need to stop the runtime gracefully. `runtime.stop_when_idle()` is common: it waits until all messages currently in the queue have been processed, then stops. `runtime.stop()` stops more abruptly.
+
+4. **State Management (Office Records):**
+ * The runtime can save the state of *all* the agents it manages (`runtime.save_state()`) and load it back later (`runtime.load_state()`). This is useful for pausing and resuming complex multi-agent interactions. It can also save/load state for individual agents (`runtime.agent_save_state()` / `runtime.agent_load_state()`). We'll touch more on state in [Chapter 7: Memory](07_memory.md).
+
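+A minimal sketch of that save/load cycle (our own example, assuming the runtime is otherwise idle):
+
+```python
+# Snapshot every agent the runtime manages...
+state = await runtime.save_state()
+# ...and restore it later, e.g. in a fresh process with the same factories
+await runtime.load_state(state)
+```
+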
+## Use Case Example: Running Our Researcher and Writer
+
+Let's finally run the Researcher/Writer scenario from Chapters 1 and 2. We need the `AgentRuntime` to make it happen.
+
+**Goal:**
+1. Create a runtime.
+2. Register factories for a 'researcher' and a 'writer' agent.
+3. Tell the runtime that 'writer' agents are interested in "research.facts.available" topics (add subscription).
+4. Start the runtime.
+5. Send an initial `ResearchTopic` message to a 'researcher' agent.
+6. Let the system run (Researcher publishes facts, Runtime delivers to Writer via subscription, Writer processes).
+7. Stop the runtime when idle.
+
+**Code Snippets (Simplified):**
+
+```python
+# 0. Imports and Message Definitions (from previous chapters)
+import asyncio
+from dataclasses import dataclass
+from autogen_core import (
+ AgentId, BaseAgent, SingleThreadedAgentRuntime, TopicId,
+ MessageContext, TypeSubscription, AgentInstantiationContext
+)
+
+@dataclass
+class ResearchTopic: topic: str
+@dataclass
+class ResearchFacts: topic: str; facts: list[str]
+```
+These are the messages our agents will exchange.
+
+```python
+# 1. Define Agent Logic (using BaseAgent)
+
+class ResearcherAgent(BaseAgent):
+ async def on_message_impl(self, message: ResearchTopic, ctx: MessageContext):
+ print(f"Researcher ({self.id}) got topic: {message.topic}")
+ facts = [f"Fact 1 about {message.topic}", f"Fact 2"]
+ results_topic = TopicId("research.facts.available", message.topic)
+ # Use the runtime (via self.publish_message helper) to publish
+ await self.publish_message(
+ ResearchFacts(topic=message.topic, facts=facts), results_topic
+ )
+ print(f"Researcher ({self.id}) published facts to {results_topic}")
+
+class WriterAgent(BaseAgent):
+ async def on_message_impl(self, message: ResearchFacts, ctx: MessageContext):
+ print(f"Writer ({self.id}) received facts via topic '{ctx.topic_id}': {message.facts}")
+ draft = f"Draft for {message.topic}: {'; '.join(message.facts)}"
+ print(f"Writer ({self.id}) created draft: '{draft}'")
+ # This agent doesn't send further messages in this example
+```
+Here we define the behavior of our two agent types, inheriting from `BaseAgent` which gives us `self.id`, `self.publish_message`, etc.
+
+```python
+# 2. Define Agent Factories
+
+def researcher_factory():
+ # Gets runtime/id via AgentInstantiationContext inside BaseAgent.__init__
+ print("Runtime is creating a ResearcherAgent...")
+ return ResearcherAgent(description="I research topics.")
+
+def writer_factory():
+ print("Runtime is creating a WriterAgent...")
+ return WriterAgent(description="I write drafts from facts.")
+```
+These simple functions tell the runtime *how* to create instances of our agents when needed.
+
+```python
+# 3. Setup and Run the Runtime
+
+async def main():
+ # Create the runtime (the office manager)
+ runtime = SingleThreadedAgentRuntime()
+
+ # Register the factories (tell the manager how to hire)
+ await runtime.register_factory("researcher", researcher_factory)
+ await runtime.register_factory("writer", writer_factory)
+ print("Registered agent factories.")
+
+ # Add the subscription (tell manager who listens to which announcements)
+ # Rule: Messages to topics of type "research.facts.available"
+ # should go to a "writer" agent whose key matches the topic source.
+ writer_sub = TypeSubscription(topic_type="research.facts.available", agent_type="writer")
+ await runtime.add_subscription(writer_sub)
+ print(f"Added subscription: {writer_sub.id}")
+
+ # Start the runtime (open the office)
+ runtime.start()
+ print("Runtime started.")
+
+ # Send the initial message to kick things off
+ research_task_topic = "AutoGen Agents"
+ researcher_instance_id = AgentId(type="researcher", key=research_task_topic)
+ print(f"Sending initial topic '{research_task_topic}' to {researcher_instance_id}")
+ await runtime.send_message(
+ message=ResearchTopic(topic=research_task_topic),
+ recipient=researcher_instance_id,
+ )
+
+ # Wait until all messages are processed (wait for work day to end)
+ print("Waiting for runtime to become idle...")
+ await runtime.stop_when_idle()
+ print("Runtime stopped.")
+
+# Run the main function
+asyncio.run(main())
+```
+This script sets up the `SingleThreadedAgentRuntime`, registers the blueprints (factories) and communication rules (subscription), starts the process, and then shuts down cleanly.
+
+**Expected Output (Conceptual Order):**
+
+```
+Registered agent factories.
+Added subscription: type=research.facts.available=>agent=writer
+Runtime started.
+Sending initial topic 'AutoGen Agents' to researcher/AutoGen Agents
+Waiting for runtime to become idle...
+Runtime is creating a ResearcherAgent... # First time researcher/AutoGen Agents is needed
+Researcher (researcher/AutoGen Agents) got topic: AutoGen Agents
+Researcher (researcher/AutoGen Agents) published facts to research.facts.available/AutoGen Agents
+Runtime is creating a WriterAgent... # First time writer/AutoGen Agents is needed (due to subscription)
+Writer (writer/AutoGen Agents) received facts via topic 'research.facts.available/AutoGen Agents': ['Fact 1 about AutoGen Agents', 'Fact 2']
+Writer (writer/AutoGen Agents) created draft: 'Draft for AutoGen Agents: Fact 1 about AutoGen Agents; Fact 2'
+Runtime stopped.
+```
+You can see the runtime orchestrating the creation of agents and the flow of messages based on the initial request and the subscription rule.
+
+## Under the Hood: How the Manager Works
+
+Let's peek inside the `SingleThreadedAgentRuntime` (a common implementation provided by AutoGen Core) to understand the flow.
+
+**Core Idea:** It uses an internal queue (`_message_queue`) to hold incoming requests (`send_message`, `publish_message`). A background task continuously takes items from the queue and processes them one by one (though the *handling* of a message might involve `await` and allow other tasks to run).
+
+**1. Agent Creation (`_get_agent`, `_invoke_agent_factory`)**
+
+When the runtime needs an agent instance (e.g., to deliver a message) that hasn't been created yet:
+
+```mermaid
+sequenceDiagram
+ participant Runtime as AgentRuntime
+ participant Factory as Agent Factory Func
+ participant AgentCtx as AgentInstantiationContext
+ participant Agent as New Agent Instance
+
+ Runtime->>Runtime: Check if agent instance exists (e.g., in `_instantiated_agents` dict)
+ alt Agent Not Found
+ Runtime->>Runtime: Find registered factory for agent type
+ Runtime->>AgentCtx: Set current runtime & agent_id
+ activate AgentCtx
+ Runtime->>Factory: Call factory function()
+ activate Factory
+ Factory->>AgentCtx: (Inside Agent.__init__) Get current runtime
+ AgentCtx-->>Factory: Return runtime
+ Factory->>AgentCtx: (Inside Agent.__init__) Get current agent_id
+ AgentCtx-->>Factory: Return agent_id
+ Factory-->>Runtime: Return new Agent instance
+ deactivate Factory
+ Runtime->>AgentCtx: Clear context
+ deactivate AgentCtx
+ Runtime->>Runtime: Store new agent instance
+ end
+ Runtime->>Runtime: Return agent instance
+```
+
+* The runtime looks up the factory function registered for the required `AgentId.type`.
+* It uses `AgentInstantiationContext.populate_context` to temporarily store its own reference and the target `AgentId`.
+* It calls the factory function.
+* Inside the agent's `__init__` (usually via `BaseAgent`), `AgentInstantiationContext.current_runtime()` and `AgentInstantiationContext.current_agent_id()` are called to retrieve the context set by the runtime.
+* The factory returns the fully initialized agent instance.
+* The runtime stores this instance for future use.
+
+```python
+# From: _agent_instantiation.py (Simplified)
+from contextlib import contextmanager
+from contextvars import ContextVar
+
+class AgentInstantiationContext:
+    _CONTEXT_VAR = ContextVar("agent_context")  # Stores (runtime, agent_id)
+
+ @classmethod
+ @contextmanager
+ def populate_context(cls, ctx: tuple[AgentRuntime, AgentId]):
+ token = cls._CONTEXT_VAR.set(ctx) # Store context for this block
+ try:
+ yield # Code inside the 'with' block runs here
+ finally:
+ cls._CONTEXT_VAR.reset(token) # Clean up context
+
+ @classmethod
+ def current_runtime(cls) -> AgentRuntime:
+ return cls._CONTEXT_VAR.get()[0] # Retrieve runtime from context
+
+ @classmethod
+ def current_agent_id(cls) -> AgentId:
+ return cls._CONTEXT_VAR.get()[1] # Retrieve agent_id from context
+```
+This context manager pattern ensures the correct runtime and ID are available *only* during the agent's creation by the runtime.
+
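+A sketch of how the runtime wraps a factory call with this context (mirroring the code above; names are illustrative):
+
+```python
+# Inside the runtime, when an agent instance is first needed:
+with AgentInstantiationContext.populate_context((runtime, agent_id)):
+    agent = researcher_factory()  # __init__ can now read the runtime and its id
+```
+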
+**2. Direct Messaging (`send_message` -> `_process_send`)**
+
+```mermaid
+sequenceDiagram
+ participant Sender as Sending Agent/Code
+ participant Runtime as AgentRuntime
+ participant Queue as Internal Queue
+ participant Recipient as Recipient Agent
+
+ Sender->>+Runtime: send_message(msg, recipient_id, ...)
+ Runtime->>Runtime: Create Future (for response)
+    Runtime->>Queue: Put SendMessageEnvelope(msg, recipient_id, future)
+ Runtime-->>-Sender: Return awaitable Future
+ Note over Queue, Runtime: Background task picks up envelope
+ Runtime->>Runtime: _process_send(envelope)
+ Runtime->>+Recipient: _get_agent(recipient_id) (creates if needed)
+ Recipient-->>-Runtime: Return Agent instance
+ Runtime->>+Recipient: on_message(msg, context)
+ Recipient->>Recipient: Process message...
+ Recipient-->>-Runtime: Return response value
+ Runtime->>Runtime: Set Future result with response value
+```
+
+* `send_message` creates a `Future` object (a placeholder for the eventual result) and wraps the message details in a `SendMessageEnvelope`.
+* This envelope is put onto the internal `_message_queue`.
+* The background task picks up the envelope.
+* `_process_send` gets the recipient agent instance (using `_get_agent`).
+* It calls the recipient's `on_message` method.
+* When `on_message` returns a result, `_process_send` sets the result on the `Future` object, which makes the original `await runtime.send_message(...)` call return the value.
+
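+In pseudocode, that queue-and-future pattern looks roughly like this (a hedged simplification; envelope and attribute names follow the description above):
+
+```python
+# Hedged sketch of the send path inside SingleThreadedAgentRuntime
+async def send_message(self, message, recipient):
+    future = asyncio.get_running_loop().create_future()
+    await self._message_queue.put(SendMessageEnvelope(message, recipient, future))
+    # _process_send later sets the result once on_message returns
+    return await future
+```
+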
+**3. Publish/Subscribe (`publish_message` -> `_process_publish`)**
+
+```mermaid
+sequenceDiagram
+ participant Publisher as Publishing Agent/Code
+ participant Runtime as AgentRuntime
+ participant Queue as Internal Queue
+ participant SubManager as SubscriptionManager
+ participant Subscriber as Subscribed Agent
+
+ Publisher->>+Runtime: publish_message(msg, topic_id, ...)
+    Runtime->>Queue: Put PublishMessageEnvelope(msg, topic_id)
+ Runtime-->>-Publisher: Return (None for publish)
+ Note over Queue, Runtime: Background task picks up envelope
+ Runtime->>Runtime: _process_publish(envelope)
+ Runtime->>+SubManager: get_subscribed_recipients(topic_id)
+ SubManager->>SubManager: Find matching subscriptions
+ SubManager->>SubManager: Map subscriptions to AgentIds
+ SubManager-->>-Runtime: Return list of recipient AgentIds
+ loop For each recipient AgentId
+ Runtime->>+Subscriber: _get_agent(recipient_id) (creates if needed)
+ Subscriber-->>-Runtime: Return Agent instance
+ Runtime->>+Subscriber: on_message(msg, context with topic_id)
+ Subscriber->>Subscriber: Process message...
+ Subscriber-->>-Runtime: Return (usually None for publish)
+ end
+```
+
+* `publish_message` wraps the message in a `PublishMessageEnvelope` and puts it on the queue.
+* The background task picks it up.
+* `_process_publish` asks the `SubscriptionManager` (`_subscription_manager`) for all `AgentId`s that are subscribed to the given `topic_id`.
+* The `SubscriptionManager` checks its registered `Subscription` objects (`_subscriptions` list, added via `add_subscription`). For each `Subscription` where `is_match(topic_id)` is true, it calls `map_to_agent(topic_id)` to get the target `AgentId`.
+* For each resulting `AgentId`, the runtime gets the agent instance and calls its `on_message` method, providing the `topic_id` in the `MessageContext`.
+
+```python
+# From: _runtime_impl_helpers.py (SubscriptionManager simplified)
+class SubscriptionManager:
+ def __init__(self):
+ self._subscriptions: List[Subscription] = []
+ # Optimization cache can be added here
+
+ async def add_subscription(self, subscription: Subscription):
+ self._subscriptions.append(subscription)
+ # Clear cache if any
+
+ async def get_subscribed_recipients(self, topic: TopicId) -> List[AgentId]:
+ recipients = []
+ for sub in self._subscriptions:
+ if sub.is_match(topic):
+ recipients.append(sub.map_to_agent(topic))
+ return recipients
+```
+The `SubscriptionManager` simply iterates through registered subscriptions to find matches when a message is published.
+
+## Next Steps
+
+You now understand the `AgentRuntime` - the essential coordinator that brings Agents to life, manages their communication, and runs the entire show. It handles agent creation via factories, routes direct and published messages, and manages the system's lifecycle.
+
+With the core concepts of `Agent`, `Messaging`, and `AgentRuntime` covered, we can start looking at more specialized building blocks. Next, we'll explore how agents can use external capabilities:
+
+* [Chapter 4: Tool](04_tool.md): How to give agents tools (like functions or APIs) to perform specific actions beyond just processing messages.
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
diff --git a/output/AutoGen Core/04_tool.md b/output/AutoGen Core/04_tool.md
new file mode 100644
index 0000000..e526be1
--- /dev/null
+++ b/output/AutoGen Core/04_tool.md
@@ -0,0 +1,272 @@
+# Chapter 4: Tool - Giving Agents Specific Capabilities
+
+In the previous chapters, we learned about Agents as workers ([Chapter 1](01_agent.md)), how they can communicate directly or using announcements ([Chapter 2](02_messaging_system__topic___subscription_.md)), and the `AgentRuntime` that manages them ([Chapter 3](03_agentruntime.md)).
+
+Agents can process messages and coordinate, but what if an agent needs to perform a very specific action, like looking up information online, running a piece of code, accessing a database, or even just finding out the current date? They need specialized *capabilities*.
+
+This is where the concept of a **Tool** comes in.
+
+## Motivation: Agents Need Skills!
+
+Imagine our `Writer` agent from before. It receives facts and writes a draft. Now, let's say we want the `Writer` (or perhaps a smarter `Assistant` agent helping it) to always include the current date in the blog post title.
+
+How does the agent get the current date? It doesn't inherently know it. It needs a specific *skill* or *tool* for that.
+
+A `Tool` in AutoGen Core represents exactly this: a specific, well-defined capability that an Agent can use. Think of it like giving an employee (Agent) a specialized piece of equipment (Tool), like a calculator, a web browser, or a calendar lookup program.
+
+## Key Concepts: Understanding Tools
+
+Let's break down what defines a Tool:
+
+1. **It's a Specific Capability:** A Tool performs one well-defined task. Examples:
+ * `search_web(query: str)`
+ * `run_python_code(code: str)`
+ * `get_stock_price(ticker: str)`
+ * `get_current_date()`
+
+2. **It Has a Schema (The Manual):** This is crucial! For an Agent (especially one powered by a Large Language Model - LLM) to know *when* and *how* to use a tool, the tool needs a clear description or "manual". This is called the `ToolSchema`. It typically includes:
+ * **`name`**: A unique identifier for the tool (e.g., `get_current_date`).
+ * **`description`**: A clear explanation of what the tool does, which helps the LLM decide if this tool is appropriate for the current task (e.g., "Fetches the current date in YYYY-MM-DD format").
+ * **`parameters`**: Defines what inputs the tool needs. This is itself a schema (`ParametersSchema`) describing the input fields, their types, and which ones are required. For our `get_current_date` example, it might need no parameters. For `get_stock_price`, it would need a `ticker` parameter of type string.
+
+ ```python
+ # From: tools/_base.py (Simplified Concept)
+ from typing import TypedDict, Dict, Any, Sequence, NotRequired
+
+ class ParametersSchema(TypedDict):
+ type: str # Usually "object"
+ properties: Dict[str, Any] # Defines input fields and their types
+ required: NotRequired[Sequence[str]] # List of required field names
+
+ class ToolSchema(TypedDict):
+ name: str
+ description: NotRequired[str]
+ parameters: NotRequired[ParametersSchema]
+ # 'strict' flag also possible (Chapter 5 related)
+ ```
+    This schema allows an LLM to understand: "Ah, there's a tool called `get_current_date` that takes no inputs and gives me the current date. I should use that now!" A filled-in example of such a schema follows this list.
+
+3. **It Can Be Executed:** Once an agent decides to use a tool (often based on the schema), there needs to be a mechanism to actually *run* the tool's underlying function and get the result.
+
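+Here is the filled-in example promised above: the complete "manual" for a date tool, written out as a plain dictionary (illustrative values; the `ToolSchema` import path is assumed):
+
+```python
+from autogen_core.tools import ToolSchema
+
+date_schema: ToolSchema = {
+    "name": "get_current_date",
+    "description": "Fetches the current date in YYYY-MM-DD format.",
+    # No inputs needed, so properties is empty and nothing is required
+    "parameters": {"type": "object", "properties": {}, "required": []},
+}
+```
+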
+## Use Case Example: Adding a `get_current_date` Tool
+
+Let's equip an agent with the ability to find the current date.
+
+**Goal:** Define a tool that gets the current date and show how it could be executed by a specialized agent.
+
+**Step 1: Define the Python Function**
+
+First, we need the actual Python code that performs the action.
+
+```python
+# File: get_date_function.py
+import datetime
+
+def get_current_date() -> str:
+ """Fetches the current date as a string."""
+ today = datetime.date.today()
+ return today.isoformat() # Returns date like "2023-10-27"
+
+# Test the function
+print(f"Function output: {get_current_date()}")
+```
+This is a standard Python function. It takes no arguments and returns the date as a string.
+
+**Step 2: Wrap it as a `FunctionTool`**
+
+AutoGen Core provides a convenient way to turn a Python function like this into a `Tool` object using `FunctionTool`. It automatically inspects the function's signature (arguments and return type) and docstring to help build the `ToolSchema`.
+
+```python
+# File: create_date_tool.py
+from autogen_core.tools import FunctionTool
+from get_date_function import get_current_date # Import our function
+
+# Create the Tool instance
+# We provide the function and a clear description for the LLM
+date_tool = FunctionTool(
+ func=get_current_date,
+ description="Use this tool to get the current date in YYYY-MM-DD format."
+ # Name defaults to function name 'get_current_date'
+)
+
+# Let's see what FunctionTool generated
+print(f"Tool Name: {date_tool.name}")
+print(f"Tool Description: {date_tool.description}")
+
+# The schema defines inputs (none in this case)
+# print(f"Tool Schema Parameters: {date_tool.schema['parameters']}")
+# Output (simplified): {'type': 'object', 'properties': {}, 'required': []}
+```
+`FunctionTool` wraps our `get_current_date` function. It uses the function name as the tool name and the description we provided. It also correctly determines from the function signature that there are no input parameters (`properties: {}`).
+
+**Step 3: How an Agent Might Request Tool Use**
+
+Now we have a `date_tool`. How is it used? Typically, an LLM-powered agent (which we'll see more of in [Chapter 5: ChatCompletionClient](05_chatcompletionclient.md)) analyzes a request and decides a tool is needed. It then generates a request to *call* that tool, often using a specific message type like `FunctionCall`.
+
+```python
+# File: tool_call_request.py
+from autogen_core import FunctionCall # Represents a request to call a tool
+
+# Imagine an LLM agent decided to use the date tool.
+# It constructs this message, providing the tool name and arguments (as JSON string).
+date_call_request = FunctionCall(
+ id="call_date_001", # A unique ID for this specific call attempt
+ name="get_current_date", # Matches the Tool's name
+ arguments="{}" # An empty JSON object because no arguments are needed
+)
+
+print("FunctionCall message:", date_call_request)
+# Output: FunctionCall(id='call_date_001', name='get_current_date', arguments='{}')
+```
+This `FunctionCall` message is like a work order: "Please execute the tool named `get_current_date` with these arguments."
+
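+Because `arguments` arrives as a JSON string, an executor parses it before running the tool:
+
+```python
+import json
+
+parsed_args = json.loads(date_call_request.arguments)
+print(parsed_args)  # {} - get_current_date needs no inputs
+```
+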
+**Step 4: The `ToolAgent` Executes the Tool**
+
+Who receives this `FunctionCall` message? Usually, a specialized agent called `ToolAgent`. You create a `ToolAgent` and give it the list of tools it knows how to execute. When it receives a `FunctionCall`, it finds the matching tool and runs it.
+
+```python
+# File: tool_agent_example.py
+import asyncio
+from autogen_core.tool_agent import ToolAgent
+from autogen_core.models import FunctionExecutionResult
+from create_date_tool import date_tool # Import the tool we created
+from tool_call_request import date_call_request # Import the request message
+
+# Create an agent specifically designed to execute tools
+tool_executor = ToolAgent(
+ description="I can execute tools like getting the date.",
+ tools=[date_tool] # Give it the list of tools it manages
+)
+
+# --- Simulation of Runtime delivering the message ---
+# In a real app, the AgentRuntime (Chapter 3) would route the
+# date_call_request message to this tool_executor agent.
+# We simulate the call to its message handler here:
+
+async def simulate_execution():
+    # Fake context (normally provided by the runtime); real message
+    # contexts carry a CancellationToken, so we supply one here.
+    from autogen_core import CancellationToken
+    class MockContext: cancellation_token = CancellationToken()
+    ctx = MockContext()
+
+ print(f"ToolAgent received request: {date_call_request.name}")
+ result: FunctionExecutionResult = await tool_executor.handle_function_call(
+ message=date_call_request,
+ ctx=ctx
+ )
+ print(f"ToolAgent produced result: {result}")
+
+asyncio.run(simulate_execution())
+```
+
+**Expected Output:**
+
+```
+ToolAgent received request: get_current_date
+ToolAgent produced result: FunctionExecutionResult(content='2023-10-27', call_id='call_date_001', is_error=False, name='get_current_date') # content will be today's actual date
+```
+The `ToolAgent` received the `FunctionCall`, found the `date_tool` in its list, executed the underlying `get_current_date` function, and packaged the result (the date string) into a `FunctionExecutionResult` message. This result message can then be sent back to the agent that originally requested the tool use.
+
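+For quick testing you can also exercise a tool directly, without a `ToolAgent`. A hedged sketch, assuming `run_json` accepts JSON-style arguments plus a `CancellationToken`:
+
+```python
+import asyncio
+from autogen_core import CancellationToken
+from create_date_tool import date_tool
+
+async def try_tool():
+    result = await date_tool.run_json({}, CancellationToken())
+    print(result)  # e.g. "2023-10-27"
+
+asyncio.run(try_tool())
+```
+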
+## Under the Hood: How Tool Execution Works
+
+Let's visualize the typical flow when an LLM agent decides to use a tool managed by a `ToolAgent`.
+
+**Conceptual Flow:**
+
+```mermaid
+sequenceDiagram
+ participant LLMA as LLM Agent (Decides)
+ participant Caller as Caller Agent (Orchestrates)
+ participant ToolA as ToolAgent (Executes)
+ participant ToolFunc as Tool Function (e.g., get_current_date)
+
+ Note over LLMA: Analyzes conversation, decides tool needed.
+ LLMA->>Caller: Sends AssistantMessage containing FunctionCall(name='get_current_date', args='{}')
+ Note over Caller: Receives LLM response, sees FunctionCall.
+ Caller->>+ToolA: Uses runtime.send_message(message=FunctionCall, recipient=ToolAgent_ID)
+ Note over ToolA: Receives FunctionCall via on_message.
+ ToolA->>ToolA: Looks up 'get_current_date' in its internal list of Tools.
+ ToolA->>+ToolFunc: Calls tool.run_json(args={}) -> triggers get_current_date()
+ ToolFunc-->>-ToolA: Returns the result (e.g., "2023-10-27")
+ ToolA->>ToolA: Creates FunctionExecutionResult message with the content.
+ ToolA-->>-Caller: Returns FunctionExecutionResult via runtime messaging.
+ Note over Caller: Receives the tool result.
+ Caller->>LLMA: Sends FunctionExecutionResultMessage to LLM for next step.
+ Note over LLMA: Now knows the current date.
+```
+
+1. **Decision:** An LLM-powered agent decides a tool is needed based on the conversation and the available tools' descriptions. It generates a `FunctionCall`.
+2. **Request:** A "Caller" agent (often the same LLM agent or a managing agent) sends this `FunctionCall` message to the dedicated `ToolAgent` using the `AgentRuntime`.
+3. **Lookup:** The `ToolAgent` receives the message, extracts the tool `name` (`get_current_date`), and finds the corresponding `Tool` object (our `date_tool`) in the list it was configured with.
+4. **Execution:** The `ToolAgent` calls the `run_json` method on the `Tool` object, passing the arguments from the `FunctionCall`. For a `FunctionTool`, `run_json` validates the arguments against the generated schema and then executes the original Python function (`get_current_date`).
+5. **Result:** The Python function returns its result (the date string).
+6. **Response:** The `ToolAgent` wraps this result string in a `FunctionExecutionResult` message, including the original `call_id`, and sends it back to the Caller agent.
+7. **Continuation:** The Caller agent typically sends this result back to the LLM agent, allowing the conversation or task to continue with the new information.
+
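+Putting steps 1-7 together, a hand-rolled caller might look like the sketch below. This is illustrative only (AutoGen Core's `tool_agent_caller_loop` helper automates this pattern); it assumes a configured LLM client ([Chapter 5](05_chatcompletionclient.md)), a running runtime ([Chapter 3](03_agentruntime.md)), and the `ToolAgent` registered under `tool_agent_id`.
+
+```python
+# File: caller_loop_sketch.py (illustrative, not the library's API)
+from autogen_core.models import (
+    AssistantMessage,
+    FunctionExecutionResult,
+    FunctionExecutionResultMessage,
+)
+
+async def call_with_tools(llm_client, runtime, tool_agent_id, messages, tool_schemas):
+    # Steps 1-2: ask the LLM; it answers with text or FunctionCalls
+    response = await llm_client.create(messages=messages, tools=tool_schemas)
+    if isinstance(response.content, str):
+        return response.content  # Plain text answer, no tool needed
+
+    # Steps 3-6: forward each FunctionCall to the ToolAgent
+    results: list[FunctionExecutionResult] = []
+    for call in response.content:  # A list of FunctionCall objects
+        result = await runtime.send_message(message=call, recipient=tool_agent_id)
+        results.append(result)
+
+    # Step 7: hand the results back to the LLM for the final answer
+    followup = list(messages) + [
+        AssistantMessage(content=response.content, source="assistant"),
+        FunctionExecutionResultMessage(content=results),
+    ]
+    final = await llm_client.create(messages=followup, tools=tool_schemas)
+    return final.content
+```
+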
+**Code Glimpse:**
+
+* **`Tool` Protocol (`tools/_base.py`):** Defines the basic contract any tool must fulfill. Key methods are `schema` (property returning the `ToolSchema`) and `run_json` (method to execute the tool with JSON-like arguments).
+* **`BaseTool` (`tools/_base.py`):** An abstract class that helps implement the `Tool` protocol, especially using Pydantic models for defining arguments (`args_type`) and return values (`return_type`). It automatically generates the `parameters` part of the schema from the `args_type` model.
+* **`FunctionTool` (`tools/_function_tool.py`):** Inherits from `BaseTool`. Its magic lies in automatically creating the `args_type` Pydantic model by inspecting the wrapped Python function's signature (`args_base_model_from_signature`). Its `run` method handles calling the original sync or async Python function.
+ ```python
+ # Inside FunctionTool (Simplified Concept)
+ class FunctionTool(BaseTool[BaseModel, BaseModel]):
+ def __init__(self, func, description, ...):
+ self._func = func
+ self._signature = get_typed_signature(func)
+ # Automatically create Pydantic model for arguments
+ args_model = args_base_model_from_signature(...)
+ # Get return type from signature
+ return_type = self._signature.return_annotation
+ super().__init__(args_model, return_type, ...)
+
+ async def run(self, args: BaseModel, ...):
+ # Extract arguments from the 'args' model
+ kwargs = args.model_dump()
+ # Call the original Python function (sync or async)
+ result = await self._call_underlying_func(**kwargs)
+ return result # Must match the expected return_type
+ ```
+* **`ToolAgent` (`tool_agent/_tool_agent.py`):** A specialized `RoutedAgent`. It registers a handler specifically for `FunctionCall` messages.
+ ```python
+ # Inside ToolAgent (Simplified Concept)
+ class ToolAgent(RoutedAgent):
+ def __init__(self, ..., tools: List[Tool]):
+ super().__init__(...)
+ self._tools = {tool.name: tool for tool in tools} # Store tools by name
+
+ @message_handler # Registers this for FunctionCall messages
+ async def handle_function_call(self, message: FunctionCall, ctx: MessageContext):
+ # Find the tool by name
+ tool = self._tools.get(message.name)
+ if tool is None:
+ # Handle error: Tool not found
+ raise ToolNotFoundException(...)
+ try:
+ # Parse arguments string into a dictionary
+ arguments = json.loads(message.arguments)
+ # Execute the tool's run_json method
+ result_obj = await tool.run_json(args=arguments, ...)
+ # Convert result object back to string if needed
+ result_str = tool.return_value_as_string(result_obj)
+ # Create the success result message
+ return FunctionExecutionResult(content=result_str, ...)
+ except Exception as e:
+ # Handle execution errors
+ return FunctionExecutionResult(content=f"Error: {e}", is_error=True, ...)
+ ```
+ Its core logic is: find tool -> parse args -> run tool -> return result/error.
+
+## Next Steps
+
+You've learned how **Tools** provide specific capabilities to Agents, defined by a **Schema** that LLMs can understand. We saw how `FunctionTool` makes it easy to wrap existing Python functions and how `ToolAgent` acts as the executor for these tools.
+
+This ability for agents to use tools is fundamental to building powerful and versatile AI systems that can interact with the real world or perform complex calculations.
+
+Now that agents can use tools, we need to understand more about the agents that *decide* which tools to use, which often involves interacting with Large Language Models:
+
+* [Chapter 5: ChatCompletionClient](05_chatcompletionclient.md): How agents interact with LLMs like GPT to generate responses or decide on actions (like calling a tool).
+* [Chapter 6: ChatCompletionContext](06_chatcompletioncontext.md): How the history of the conversation, including tool calls and results, is managed when talking to an LLM.
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/AutoGen Core/05_chatcompletionclient.md b/output/AutoGen Core/05_chatcompletionclient.md
new file mode 100644
index 0000000..fb8d5e0
--- /dev/null
+++ b/output/AutoGen Core/05_chatcompletionclient.md
@@ -0,0 +1,296 @@
+# Chapter 5: ChatCompletionClient - Talking to the Brains
+
+So far, we've learned about:
+* [Agents](01_agent.md): The workers in our system.
+* [Messaging](02_messaging_system__topic___subscription_.md): How agents communicate broadly.
+* [AgentRuntime](03_agentruntime.md): The manager that runs the show.
+* [Tools](04_tool.md): How agents get specific skills.
+
+But how does an agent actually *think* or *generate text*? Many powerful agents rely on Large Language Models (LLMs) such as GPT-4, Claude, or Gemini as their "brains". How does an agent in AutoGen Core communicate with these external LLM services?
+
+This is where the **`ChatCompletionClient`** comes in. It's the dedicated component for talking to LLMs.
+
+## Motivation: Bridging the Gap to LLMs
+
+Imagine you want to build an agent that can summarize long articles.
+1. You give the agent an article (as a message).
+2. The agent needs to send this article to an LLM (like GPT-4).
+3. It also needs to tell the LLM: "Please summarize this."
+4. The LLM processes the request and generates a summary.
+5. The agent needs to receive this summary back from the LLM.
+
+How does the agent handle the technical details of connecting to the LLM's specific API, formatting the request correctly, sending it over the internet, and understanding the response?
+
+The `ChatCompletionClient` solves this! Think of it as the **standard phone line and translator** connecting your agent to the LLM service. You tell the client *what* to say (the conversation history and instructions), and it handles *how* to say it to the specific LLM and translates the LLM's reply back into a standard format.
+
+## Key Concepts: Understanding the LLM Communicator
+
+Let's break down the `ChatCompletionClient`:
+
+1. **LLM Communication Bridge:** It's the primary way AutoGen agents interact with external LLM APIs (like OpenAI, Anthropic, Google Gemini, etc.). It hides the complexity of specific API calls.
+
+2. **Standard Interface (`create` method):** It defines a common way to send requests and receive responses, regardless of the underlying LLM. The core method is `create`. You give it:
+ * `messages`: A list of messages representing the conversation history so far.
+ * Optional `tools`: A list of tools ([Chapter 4](04_tool.md)) the LLM might be able to use.
+ * Other parameters (like `json_output` hints, `cancellation_token`).
+
+3. **Messages (`LLMMessage`):** The conversation history is passed as a sequence of specific message types defined in `autogen_core.models`:
+ * `SystemMessage`: Instructions for the LLM (e.g., "You are a helpful assistant.").
+ * `UserMessage`: Input from the user or another agent (e.g., the article text).
+ * `AssistantMessage`: Previous responses from the LLM (can include text or requests to call functions/tools).
+ * `FunctionExecutionResultMessage`: The results of executing a tool/function call.
+
+4. **Tools (`ToolSchema`):** You can provide the schemas of available tools ([Chapter 4](04_tool.md)). The LLM might then respond not with text, but with a request to call one of these tools (`FunctionCall` inside an `AssistantMessage`).
+
+5. **Response (`CreateResult`):** The `create` method returns a standard `CreateResult` object containing:
+ * `content`: The LLM's generated text or a list of `FunctionCall` requests.
+ * `finish_reason`: Why the LLM stopped generating (e.g., "stop", "length", "function_calls").
+ * `usage`: How many input (`prompt_tokens`) and output (`completion_tokens`) tokens were used.
+ * `cached`: Whether the response came from a cache.
+
+6. **Token Tracking:** The client automatically tracks token usage (`prompt_tokens`, `completion_tokens`) for each call. You can query the total usage via methods like `total_usage()`. This is vital for monitoring costs, as most LLM APIs charge based on tokens.
+
+## Use Case Example: Summarizing Text with an LLM
+
+Let's build a simplified scenario where we use a `ChatCompletionClient` to ask an LLM to summarize text.
+
+**Goal:** Send text to an LLM via a client and get a summary back.
+
+**Step 1: Prepare the Input Messages**
+
+We need to structure our request as a list of `LLMMessage` objects.
+
+```python
+# File: prepare_messages.py
+from autogen_core.models import SystemMessage, UserMessage
+
+# Instructions for the LLM
+system_prompt = SystemMessage(
+ content="You are a helpful assistant designed to summarize text concisely."
+)
+
+# The text we want to summarize
+article_text = """
+AutoGen is a framework that enables the development of LLM applications using multiple agents
+that can converse with each other to solve tasks. AutoGen agents are customizable,
+conversable, and can seamlessly allow human participation. They can operate in various modes
+that employ combinations of LLMs, human inputs, and tools.
+"""
+user_request = UserMessage(
+ content=f"Please summarize the following text in one sentence:\n\n{article_text}",
+ source="User" # Indicate who provided this input
+)
+
+# Combine into a list for the client
+messages_to_send = [system_prompt, user_request]
+
+print("Messages prepared:")
+for msg in messages_to_send:
+ print(f"- {msg.type}: {msg.content[:50]}...") # Print first 50 chars
+```
+This code defines the instructions (`SystemMessage`) and the user's request (`UserMessage`) and puts them in a list, ready to be sent.
+
+**Step 2: Use the ChatCompletionClient (Conceptual)**
+
+Now, we need an instance of a `ChatCompletionClient`. In a real application, you'd configure a specific client (like `OpenAIChatCompletionClient` with your API key). For this example, let's imagine we have a pre-configured client called `llm_client`.
+
+```python
+# File: call_llm_client.py
+import asyncio
+from autogen_core.models import CreateResult, RequestUsage
+# Assume 'messages_to_send' is from the previous step
+# Assume 'llm_client' is a pre-configured ChatCompletionClient instance
+# (e.g., llm_client = OpenAIChatCompletionClient(config=...))
+
+async def get_summary(client, messages):
+ print("\nSending messages to LLM via ChatCompletionClient...")
+ try:
+ # The core call: send messages, get structured result
+ response: CreateResult = await client.create(
+ messages=messages,
+ # We aren't providing tools in this simple example
+ tools=[]
+ )
+ print("Received response:")
+ print(f"- Finish Reason: {response.finish_reason}")
+ print(f"- Content: {response.content}") # This should be the summary
+ print(f"- Usage (Tokens): Prompt={response.usage.prompt_tokens}, Completion={response.usage.completion_tokens}")
+ print(f"- Cached: {response.cached}")
+
+ # Also, check total usage tracked by the client
+ total_usage = client.total_usage()
+ print(f"\nClient Total Usage: Prompt={total_usage.prompt_tokens}, Completion={total_usage.completion_tokens}")
+
+ except Exception as e:
+ print(f"An error occurred: {e}")
+
+# --- Placeholder for actual client ---
+class MockChatCompletionClient: # Simulate a real client
+ _total_usage = RequestUsage(prompt_tokens=0, completion_tokens=0)
+ async def create(self, messages, tools=[], **kwargs) -> CreateResult:
+ # Simulate API call and response
+ prompt_len = sum(len(str(m.content)) for m in messages) // 4 # Rough token estimate
+ summary = "AutoGen is a multi-agent framework for developing LLM applications."
+ completion_len = len(summary) // 4 # Rough token estimate
+ usage = RequestUsage(prompt_tokens=prompt_len, completion_tokens=completion_len)
+ self._total_usage.prompt_tokens += usage.prompt_tokens
+ self._total_usage.completion_tokens += usage.completion_tokens
+ return CreateResult(
+ finish_reason="stop", content=summary, usage=usage, cached=False
+ )
+ def total_usage(self) -> RequestUsage: return self._total_usage
+ # Other required methods (count_tokens, model_info etc.) omitted for brevity
+
+async def main():
+ from prepare_messages import messages_to_send # Get messages from previous step
+ mock_client = MockChatCompletionClient()
+ await get_summary(mock_client, messages_to_send)
+
+# asyncio.run(main()) # If you run this, it uses the mock client
+```
+This code shows the essential `client.create(...)` call. We pass our `messages_to_send` and receive a `CreateResult`. We then print the summary (`response.content`), the token usage reported for that specific call (`response.usage`), and the total tracked by the client (`client.total_usage()`).
+
+**How an Agent Uses It:**
+Typically, an agent's logic (e.g., inside its `on_message` handler) would:
+1. Receive an incoming message (like the article to summarize).
+2. Prepare the list of `LLMMessage` objects (including system prompts, history, and the new request).
+3. Access a `ChatCompletionClient` instance (often provided during agent setup or accessed via its context).
+4. Call `await client.create(...)`.
+5. Process the `CreateResult` (e.g., extract the summary text, check for function calls if tools were provided).
+6. Potentially send the result as a new message to another agent or return it.
+
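+As a sketch in code, steps 1-6 might look like this inside a `RoutedAgent` (the same agent style `ToolAgent` builds on). The class, handler, and prompt below are illustrative; the client is assumed to be handed in when the agent is constructed.
+
+```python
+# File: summarizer_agent_sketch.py (illustrative)
+from autogen_core import MessageContext, RoutedAgent, message_handler
+from autogen_core.models import SystemMessage, UserMessage
+
+class SummarizerAgent(RoutedAgent):
+    def __init__(self, model_client) -> None:  # Step 3: client provided at setup
+        super().__init__("Summarizes incoming text")
+        self._client = model_client
+
+    @message_handler
+    async def handle_text(self, message: str, ctx: MessageContext) -> str:
+        # Steps 1-2: receive the text and build the LLMMessage list
+        messages = [
+            SystemMessage(content="You summarize text concisely."),
+            UserMessage(content=f"Summarize this:\n\n{message}", source="User"),
+        ]
+        # Step 4: call the client
+        result = await self._client.create(messages=messages)
+        # Steps 5-6: extract and return the summary text
+        assert isinstance(result.content, str)
+        return result.content
+```
+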
+## Under the Hood: How the Client Talks to the LLM
+
+What happens when you call `await client.create(...)`?
+
+**Conceptual Flow:**
+
+```mermaid
+sequenceDiagram
+ participant Agent as Agent Logic
+ participant Client as ChatCompletionClient
+ participant Formatter as API Formatter
+ participant HTTP as HTTP Client
+ participant LLM_API as External LLM API
+
+ Agent->>+Client: create(messages, tools)
+ Client->>+Formatter: Format messages & tools for specific API (e.g., OpenAI JSON format)
+ Formatter-->>-Client: Return formatted request body
+ Client->>+HTTP: Send POST request to LLM API endpoint with formatted body & API Key
+ HTTP->>+LLM_API: Transmit request over network
+ LLM_API->>LLM_API: Process request, generate completion/function call
+ LLM_API-->>-HTTP: Return API response (e.g., JSON)
+ HTTP-->>-Client: Receive HTTP response
+ Client->>+Formatter: Parse API response (extract content, usage, finish_reason)
+ Formatter-->>-Client: Return parsed data
+ Client->>Client: Create standard CreateResult object
+ Client-->>-Agent: Return CreateResult
+```
+
+1. **Prepare:** The `ChatCompletionClient` takes the standard `LLMMessage` list and `ToolSchema` list.
+2. **Format:** It translates these into the specific format required by the target LLM's API (e.g., the JSON structure expected by OpenAI's `/chat/completions` endpoint). This might involve renaming roles (like `SystemMessage` to `system`), formatting tool descriptions, etc.
+3. **Request:** It uses an underlying HTTP client to send a network request (usually a POST request) to the LLM service's API endpoint, including the formatted data and authentication (like an API key).
+4. **Wait & Receive:** It waits for the LLM service to process the request and send back a response over the network.
+5. **Parse:** It receives the raw HTTP response (usually JSON) from the API.
+6. **Standardize:** It parses this specific API response, extracting the generated text or function calls, token usage figures, finish reason, etc.
+7. **Return:** It packages all this information into a standard `CreateResult` object and returns it to the calling agent code.
+
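+As a rough illustration of step 2, translating our standard messages into an OpenAI-style request body could look like the sketch below. Real clients handle far more (images, tool schemas, function calls, names), so treat this purely as the shape of the idea.
+
+```python
+# File: format_sketch.py (illustrative of the "Format" step only)
+from autogen_core.models import (
+    AssistantMessage, LLMMessage, SystemMessage, UserMessage,
+)
+
+def to_openai_format(messages: list[LLMMessage]) -> list[dict]:
+    """Map each standard LLMMessage onto an OpenAI-style role/content dict."""
+    role_map = {SystemMessage: "system", UserMessage: "user", AssistantMessage: "assistant"}
+    formatted = []
+    for msg in messages:
+        role = role_map.get(type(msg), "user")  # Fall back to "user" for other types
+        formatted.append({"role": role, "content": msg.content})
+    return formatted
+```
+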
+**Code Glimpse:**
+
+* **`ChatCompletionClient` Protocol (`models/_model_client.py`):** This is the abstract base class (or protocol) defining the *contract* that all specific clients must follow.
+
+ ```python
+ # From: models/_model_client.py (Simplified ABC)
+ from abc import ABC, abstractmethod
+ from typing import Sequence, Optional, Mapping, Any, AsyncGenerator, Union
+ from ._types import LLMMessage, CreateResult, RequestUsage
+ from ..tools import Tool, ToolSchema
+ from .. import CancellationToken
+
+ class ChatCompletionClient(ABC):
+ @abstractmethod
+ async def create(
+ self, messages: Sequence[LLMMessage], *,
+ tools: Sequence[Tool | ToolSchema] = [],
+ json_output: Optional[bool] = None, # Hint for JSON mode
+ extra_create_args: Mapping[str, Any] = {}, # API-specific args
+ cancellation_token: Optional[CancellationToken] = None,
+ ) -> CreateResult: ... # The core method
+
+ @abstractmethod
+ def create_stream(
+ self, # Similar to create, but yields results incrementally
+ # ... parameters ...
+ ) -> AsyncGenerator[Union[str, CreateResult], None]: ...
+
+ @abstractmethod
+ def total_usage(self) -> RequestUsage: ... # Get total tracked usage
+
+ @abstractmethod
+ def count_tokens(self, messages: Sequence[LLMMessage], *, tools: Sequence[Tool | ToolSchema] = []) -> int: ... # Estimate token count
+
+ # Other methods like close(), actual_usage(), remaining_tokens(), model_info...
+ ```
+ Concrete classes like `OpenAIChatCompletionClient`, `AnthropicChatCompletionClient` etc., implement these methods using the specific libraries and API calls for each service.
+
+* **`LLMMessage` Types (`models/_types.py`):** These define the structure of messages passed *to* the client.
+
+ ```python
+ # From: models/_types.py (Simplified)
+ from pydantic import BaseModel
+ from typing import List, Union, Literal
+    from .. import FunctionCall, Image # FunctionCall from Chapter 4; Image enables multimodal input
+
+ class SystemMessage(BaseModel):
+ content: str
+ type: Literal["SystemMessage"] = "SystemMessage"
+
+ class UserMessage(BaseModel):
+ content: Union[str, List[Union[str, Image]]] # Can include images!
+ source: str
+ type: Literal["UserMessage"] = "UserMessage"
+
+ class AssistantMessage(BaseModel):
+ content: Union[str, List[FunctionCall]] # Can be text or function calls
+ source: str
+ type: Literal["AssistantMessage"] = "AssistantMessage"
+
+ # FunctionExecutionResultMessage also exists here...
+ ```
+
+* **`CreateResult` (`models/_types.py`):** This defines the structure of the response *from* the client.
+
+ ```python
+ # From: models/_types.py (Simplified)
+ from pydantic import BaseModel
+ from dataclasses import dataclass
+    from typing import List, Literal, Union
+ from .. import FunctionCall
+
+ @dataclass
+ class RequestUsage:
+ prompt_tokens: int
+ completion_tokens: int
+
+ FinishReasons = Literal["stop", "length", "function_calls", "content_filter", "unknown"]
+
+ class CreateResult(BaseModel):
+ finish_reason: FinishReasons
+ content: Union[str, List[FunctionCall]] # LLM output
+ usage: RequestUsage # Token usage for this call
+ cached: bool
+ # Optional fields like logprobs, thought...
+ ```
+ Using these standard types ensures that agent logic can work consistently, even if you switch the underlying LLM service by using a different `ChatCompletionClient` implementation.
+
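+The protocol also defines `create_stream`, which yields partial results as they arrive instead of waiting for the full reply. Based on its declared signature (`AsyncGenerator[Union[str, CreateResult], None]`), consuming it might look like this sketch:
+
+```python
+# Consuming create_stream (conceptual usage derived from the protocol above)
+async def stream_summary(client, messages):
+    final_result = None
+    async for chunk in client.create_stream(messages=messages):
+        if isinstance(chunk, str):
+            print(chunk, end="", flush=True)  # A partial piece of generated text
+        else:
+            final_result = chunk  # The complete CreateResult arrives last
+    return final_result
+```
+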
+## Next Steps
+
+You now understand the role of `ChatCompletionClient` as the crucial link between AutoGen agents and the powerful capabilities of Large Language Models. It provides a standard way to send conversational history and tool definitions, receive generated text or function call requests, and track token usage.
+
+Managing the conversation history (`messages`) sent to the client is very important. How do you ensure the LLM has the right context, especially after tool calls have happened?
+
+* [Chapter 6: ChatCompletionContext](06_chatcompletioncontext.md): Learn how AutoGen helps manage the conversation history, including adding tool call requests and their results, before sending it to the `ChatCompletionClient`.
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/AutoGen Core/06_chatcompletioncontext.md b/output/AutoGen Core/06_chatcompletioncontext.md
new file mode 100644
index 0000000..3b04b3d
--- /dev/null
+++ b/output/AutoGen Core/06_chatcompletioncontext.md
@@ -0,0 +1,330 @@
+# Chapter 6: ChatCompletionContext - Remembering the Conversation
+
+In [Chapter 5: ChatCompletionClient](05_chatcompletionclient.md), we learned how agents talk to Large Language Models (LLMs) using a `ChatCompletionClient`. We saw that we need to send a list of `messages` (the conversation history) to the LLM so it knows the context.
+
+But conversations can get very long! Imagine talking on the phone for an hour. Can you remember *every single word* that was said? Probably not. You remember the main points, the beginning, and what was said most recently. LLMs have a similar limitation: they can only pay attention to a certain amount of text at once (called the "context window").
+
+If we send the *entire* history of a very long chat, it might be too much for the LLM, lead to errors, be slow, or cost more money (since many LLMs charge based on the amount of text).
+
+So, how do we smartly choose *which* parts of the conversation history to send? This is the problem that **`ChatCompletionContext`** solves.
+
+## Motivation: Keeping LLM Conversations Focused
+
+Let's say we have a helpful assistant agent chatting with a user:
+
+1. **User:** "Hi! Can you tell me about AutoGen?"
+2. **Assistant:** "Sure! AutoGen is a framework..." (provides details)
+3. **User:** "Thanks! Now, can you draft an email to my team about our upcoming meeting?"
+4. **Assistant:** "Okay, what's the meeting about?"
+5. **User:** "It's about the project planning for Q3."
+6. **Assistant:** (Needs to draft the email)
+
+When the Assistant needs to draft the email (step 6), does it need the *exact* text from step 2 about what AutoGen is? Probably not. It definitely needs the instructions from step 3 and the topic from step 5. Maybe the initial greeting isn't super important either.
+
+`ChatCompletionContext` acts like a **smart transcript editor**. Before sending the history to the LLM via the `ChatCompletionClient`, it reviews the full conversation log and prepares a shorter, focused version containing only the messages it thinks are most relevant for the LLM's next response.
+
+## Key Concepts: Managing the Chat History
+
+1. **The Full Transcript Holder:** A `ChatCompletionContext` object holds the *complete* list of messages (`LLMMessage` objects like `SystemMessage`, `UserMessage`, `AssistantMessage` from Chapter 5) that have occurred in a specific conversation thread. You add new messages using its `add_message` method.
+
+2. **The Smart View Generator (`get_messages`):** The core job of `ChatCompletionContext` is done by its `get_messages` method. When called, it looks at the *full* transcript it holds, but returns only a *subset* of those messages based on its specific strategy. This subset is what you'll actually send to the `ChatCompletionClient`.
+
+3. **Different Strategies for Remembering:** Because different situations require different focus, AutoGen Core provides several `ChatCompletionContext` implementations (strategies):
+ * **`UnboundedChatCompletionContext`:** The simplest (and sometimes riskiest!). It doesn't edit anything; `get_messages` just returns the *entire* history. Good for short chats, but can break with long ones.
+ * **`BufferedChatCompletionContext`:** Like remembering only the last few things someone said. It keeps the most recent `N` messages (where `N` is the `buffer_size` you set). Good for focusing on recent interactions.
+ * **`HeadAndTailChatCompletionContext`:** Tries to get the best of both worlds. It keeps the first few messages (the "head", maybe containing initial instructions) and the last few messages (the "tail", the recent context). It skips the messages in the middle.
+
+## Use Case Example: Chatting with Different Memory Strategies
+
+Let's simulate adding messages to different context managers and see what `get_messages` returns.
+
+**Step 1: Define some messages**
+
+```python
+# File: define_chat_messages.py
+from autogen_core.models import (
+ SystemMessage, UserMessage, AssistantMessage, LLMMessage
+)
+from typing import List
+
+# The initial instruction for the assistant
+system_msg = SystemMessage(content="You are a helpful assistant.")
+
+# A sequence of user/assistant turns
+chat_sequence: List[LLMMessage] = [
+ UserMessage(content="What is AutoGen?", source="User"),
+ AssistantMessage(content="AutoGen is a multi-agent framework...", source="Agent"),
+ UserMessage(content="What can it do?", source="User"),
+ AssistantMessage(content="It can build complex LLM apps.", source="Agent"),
+ UserMessage(content="Thanks!", source="User")
+]
+
+# Combine system message and the chat sequence
+full_history: List[LLMMessage] = [system_msg] + chat_sequence
+
+print(f"Total messages in full history: {len(full_history)}")
+# Output: Total messages in full history: 6
+```
+We have a full history of 6 messages (1 system + 5 chat turns).
+
+**Step 2: Use `UnboundedChatCompletionContext`**
+
+This context keeps everything.
+
+```python
+# File: use_unbounded_context.py
+import asyncio
+from define_chat_messages import full_history
+from autogen_core.model_context import UnboundedChatCompletionContext
+
+async def main():
+ # Create context and add all messages
+ context = UnboundedChatCompletionContext()
+ for msg in full_history:
+ await context.add_message(msg)
+
+ # Get the messages to send to the LLM
+ messages_for_llm = await context.get_messages()
+
+ print(f"--- Unbounded Context ({len(messages_for_llm)} messages) ---")
+ for i, msg in enumerate(messages_for_llm):
+ print(f"{i+1}. [{msg.type}]: {msg.content[:30]}...")
+
+# asyncio.run(main()) # If run
+```
+
+**Expected Output (Unbounded):**
+```
+--- Unbounded Context (6 messages) ---
+1. [SystemMessage]: You are a helpful assistant....
+2. [UserMessage]: What is AutoGen?...
+3. [AssistantMessage]: AutoGen is a multi-agent fram...
+4. [UserMessage]: What can it do?...
+5. [AssistantMessage]: It can build complex LLM apps...
+6. [UserMessage]: Thanks!...
+```
+It returns all 6 messages, exactly as added.
+
+**Step 3: Use `BufferedChatCompletionContext`**
+
+Let's keep only the last 3 messages.
+
+```python
+# File: use_buffered_context.py
+import asyncio
+from define_chat_messages import full_history
+from autogen_core.model_context import BufferedChatCompletionContext
+
+async def main():
+ # Keep only the last 3 messages
+ context = BufferedChatCompletionContext(buffer_size=3)
+ for msg in full_history:
+ await context.add_message(msg)
+
+ messages_for_llm = await context.get_messages()
+
+ print(f"--- Buffered Context (buffer=3, {len(messages_for_llm)} messages) ---")
+ for i, msg in enumerate(messages_for_llm):
+ print(f"{i+1}. [{msg.type}]: {msg.content[:30]}...")
+
+# asyncio.run(main()) # If run
+```
+
+**Expected Output (Buffered):**
+```
+--- Buffered Context (buffer=3, 3 messages) ---
+1. [UserMessage]: What can it do?...
+2. [AssistantMessage]: It can build complex LLM apps...
+3. [UserMessage]: Thanks!...
+```
+It only returns the last 3 messages from the full history. The system message and the first chat turn are omitted.
+
+**Step 4: Use `HeadAndTailChatCompletionContext`**
+
+Let's keep the first message (head=1) and the last two messages (tail=2).
+
+```python
+# File: use_head_tail_context.py
+import asyncio
+from define_chat_messages import full_history
+from autogen_core.model_context import HeadAndTailChatCompletionContext
+
+async def main():
+ # Keep first 1 and last 2 messages
+ context = HeadAndTailChatCompletionContext(head_size=1, tail_size=2)
+ for msg in full_history:
+ await context.add_message(msg)
+
+ messages_for_llm = await context.get_messages()
+
+ print(f"--- Head & Tail Context (h=1, t=2, {len(messages_for_llm)} messages) ---")
+ for i, msg in enumerate(messages_for_llm):
+ print(f"{i+1}. [{msg.type}]: {msg.content[:30]}...")
+
+# asyncio.run(main()) # If run
+```
+
+**Expected Output (Head & Tail):**
+```
+--- Head & Tail Context (h=1, t=2, 4 messages) ---
+1. [SystemMessage]: You are a helpful assistant....
+2. [UserMessage]: Skipped 3 messages....
+3. [AssistantMessage]: It can build complex LLM apps...
+4. [UserMessage]: Thanks!...
+```
+It keeps the very first message (`SystemMessage`), then inserts a placeholder telling the LLM that some messages were skipped, and finally includes the last two messages. This preserves the initial instruction and the most recent context.
+
+**Which one to choose?** It depends on your agent's task!
+* Simple Q&A? `Buffered` might be fine.
+* Following complex initial instructions? `HeadAndTail` or even `Unbounded` (if short) might be better.
+
+## Under the Hood: How Context is Managed
+
+The core idea is defined by the `ChatCompletionContext` abstract base class.
+
+**Conceptual Flow:**
+
+```mermaid
+sequenceDiagram
+ participant Agent as Agent Logic
+ participant Context as ChatCompletionContext
+ participant FullHistory as Internal Message List
+
+ Agent->>+Context: add_message(newMessage)
+ Context->>+FullHistory: Append newMessage to list
+ FullHistory-->>-Context: List updated
+ Context-->>-Agent: Done
+
+ Agent->>+Context: get_messages()
+ Context->>+FullHistory: Read the full list
+ FullHistory-->>-Context: Return full list
+ Context->>Context: Apply Strategy (e.g., slice list for Buffered/HeadTail)
+ Context-->>-Agent: Return selected list of messages
+```
+
+1. **Adding:** When `add_message(message)` is called, the context simply appends the `message` to its internal list (`self._messages`).
+2. **Getting:** When `get_messages()` is called:
+ * The context accesses its internal `self._messages` list.
+ * The specific implementation (`Unbounded`, `Buffered`, `HeadAndTail`) applies its logic to select which messages to return.
+ * It returns the selected list.
+
+**Code Glimpse:**
+
+* **Base Class (`_chat_completion_context.py`):** Defines the structure and common methods.
+
+ ```python
+ # From: model_context/_chat_completion_context.py (Simplified)
+ from abc import ABC, abstractmethod
+ from typing import List
+ from ..models import LLMMessage
+
+ class ChatCompletionContext(ABC):
+ component_type = "chat_completion_context" # Identifies this as a component type
+
+ def __init__(self, initial_messages: List[LLMMessage] | None = None) -> None:
+ # Holds the COMPLETE history
+ self._messages: List[LLMMessage] = initial_messages or []
+
+ async def add_message(self, message: LLMMessage) -> None:
+ """Add a message to the full context."""
+ self._messages.append(message)
+
+ @abstractmethod
+ async def get_messages(self) -> List[LLMMessage]:
+ """Get the subset of messages based on the strategy."""
+ # Each subclass MUST implement this logic
+ ...
+
+ # Other methods like clear(), save_state(), load_state() exist too
+ ```
+ The base class handles storing messages; subclasses define *how* to retrieve them.
+
+* **Unbounded (`_unbounded_chat_completion_context.py`):** The simplest implementation.
+
+ ```python
+ # From: model_context/_unbounded_chat_completion_context.py (Simplified)
+ from typing import List
+ from ._chat_completion_context import ChatCompletionContext
+ from ..models import LLMMessage
+
+ class UnboundedChatCompletionContext(ChatCompletionContext):
+ async def get_messages(self) -> List[LLMMessage]:
+ """Returns all messages."""
+ return self._messages # Just return the whole internal list
+ ```
+
+* **Buffered (`_buffered_chat_completion_context.py`):** Uses slicing to get the end of the list.
+
+ ```python
+ # From: model_context/_buffered_chat_completion_context.py (Simplified)
+ from typing import List
+ from ._chat_completion_context import ChatCompletionContext
+ from ..models import LLMMessage, FunctionExecutionResultMessage
+
+ class BufferedChatCompletionContext(ChatCompletionContext):
+ def __init__(self, buffer_size: int, ...):
+ super().__init__(...)
+ self._buffer_size = buffer_size
+
+ async def get_messages(self) -> List[LLMMessage]:
+ """Get at most `buffer_size` recent messages."""
+ # Slice the list to get the last 'buffer_size' items
+ messages = self._messages[-self._buffer_size :]
+ # Special case: Avoid starting with a function result message
+ if messages and isinstance(messages[0], FunctionExecutionResultMessage):
+ messages = messages[1:]
+ return messages
+ ```
+
+* **Head and Tail (`_head_and_tail_chat_completion_context.py`):** Combines slices from the beginning and end.
+
+ ```python
+ # From: model_context/_head_and_tail_chat_completion_context.py (Simplified)
+ from typing import List
+ from ._chat_completion_context import ChatCompletionContext
+ from ..models import LLMMessage, UserMessage
+
+ class HeadAndTailChatCompletionContext(ChatCompletionContext):
+ def __init__(self, head_size: int, tail_size: int, ...):
+ super().__init__(...)
+ self._head_size = head_size
+ self._tail_size = tail_size
+
+ async def get_messages(self) -> List[LLMMessage]:
+ head = self._messages[: self._head_size] # First 'head_size' items
+ tail = self._messages[-self._tail_size :] # Last 'tail_size' items
+ num_skipped = len(self._messages) - len(head) - len(tail)
+
+            if num_skipped <= 0: # Nothing was skipped; return everything
+ return self._messages
+ else: # If messages were skipped
+ placeholder = [UserMessage(content=f"Skipped {num_skipped} messages.", source="System")]
+ # Combine head + placeholder + tail
+ return head + placeholder + tail
+ ```
+ These implementations provide different ways to manage the context window effectively.
+
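+Because subclasses only need to implement `get_messages`, writing your own strategy is straightforward. As a sketch (the class name and policy are invented for illustration), a context that always keeps `SystemMessage`s plus the most recent turns could look like:
+
+```python
+# File: custom_context_sketch.py (hypothetical strategy, for illustration)
+from typing import List
+from autogen_core.model_context import ChatCompletionContext
+from autogen_core.models import LLMMessage, SystemMessage
+
+class SystemPlusRecentContext(ChatCompletionContext):
+    def __init__(self, recent: int = 3) -> None:
+        super().__init__()
+        self._recent = recent
+
+    async def get_messages(self) -> List[LLMMessage]:
+        # Keep every SystemMessage, plus the last `recent` other messages
+        system = [m for m in self._messages if isinstance(m, SystemMessage)]
+        others = [m for m in self._messages if not isinstance(m, SystemMessage)]
+        return system + others[-self._recent:]
+```
+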
+## Putting it Together with ChatCompletionClient
+
+How does an agent use `ChatCompletionContext` with the `ChatCompletionClient` from Chapter 5?
+
+1. An agent has an instance of a `ChatCompletionContext` (e.g., `BufferedChatCompletionContext`) to store its conversation history.
+2. When the agent receives a new message (e.g., a `UserMessage`), it calls `await context.add_message(new_user_message)`.
+3. To prepare for calling the LLM, the agent calls `messages_to_send = await context.get_messages()`. This gets the strategically selected subset of the history.
+4. The agent then passes this list to the `ChatCompletionClient`: `response = await llm_client.create(messages=messages_to_send, ...)`.
+5. When the LLM replies (e.g., with an `AssistantMessage`), the agent adds it back to the context: `await context.add_message(llm_response_message)`.
+
+This loop ensures that the history is continuously updated and intelligently trimmed before each call to the LLM.
+
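+As a minimal sketch of one turn of this loop (the mock client from Chapter 5 works here too):
+
+```python
+# File: chat_loop_sketch.py (illustrative)
+from autogen_core.model_context import BufferedChatCompletionContext
+from autogen_core.models import AssistantMessage, UserMessage
+
+async def one_turn(llm_client, context, user_text: str) -> str:
+    # Step 2: record the incoming message in the full history
+    await context.add_message(UserMessage(content=user_text, source="User"))
+    # Step 3: get the strategically selected subset
+    messages_to_send = await context.get_messages()
+    # Step 4: send it to the LLM
+    response = await llm_client.create(messages=messages_to_send)
+    # Step 5: record the reply so future turns can see it
+    await context.add_message(AssistantMessage(content=response.content, source="Agent"))
+    return response.content
+
+# Usage sketch:
+# context = BufferedChatCompletionContext(buffer_size=5)
+# reply = await one_turn(llm_client, context, "Hi! Can you tell me about AutoGen?")
+```
+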
+## Next Steps
+
+You've learned how `ChatCompletionContext` helps manage the conversation history sent to LLMs, preventing context window overflows and keeping the interaction focused using different strategies (`Unbounded`, `Buffered`, `HeadAndTail`).
+
+This context management is a specific form of **memory**. Agents might need to remember things beyond just the chat history. How do they store general information, state, or knowledge over time?
+
+* [Chapter 7: Memory](07_memory.md): Explore the broader concept of Memory in AutoGen Core, which provides more general ways for agents to store and retrieve information.
+* [Chapter 8: Component](08_component.md): Understand how `ChatCompletionContext` fits into the general `Component` model, allowing configuration and integration within the AutoGen system.
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/AutoGen Core/07_memory.md b/output/AutoGen Core/07_memory.md
new file mode 100644
index 0000000..8e648b5
--- /dev/null
+++ b/output/AutoGen Core/07_memory.md
@@ -0,0 +1,323 @@
+# Chapter 7: Memory - The Agent's Notebook
+
+In [Chapter 6: ChatCompletionContext](06_chatcompletioncontext.md), we saw how agents manage the *short-term* history of a single conversation before talking to an LLM. It's like remembering what was just said in the last few minutes.
+
+But what if an agent needs to remember things for much longer, across *multiple* conversations or tasks? For example, imagine an assistant agent that learns your preferences:
+* You tell it: "Please always write emails in a formal style for me."
+* Weeks later, you ask it to draft a new email.
+
+How does it remember that preference? The short-term `ChatCompletionContext` might have forgotten the earlier instruction, especially if using a strategy like `BufferedChatCompletionContext`. The agent needs a **long-term memory**.
+
+This is where the **`Memory`** abstraction comes in. Think of it as the agent's **long-term notebook or database**. While `ChatCompletionContext` is the scratchpad for the current chat, `Memory` holds persistent information the agent can add to or look up later.
+
+## Motivation: Remembering Across Conversations
+
+Our goal is to give an agent the ability to store a piece of information (like a user preference) and retrieve it later to influence its behavior, even in a completely new conversation. `Memory` provides the mechanism for this long-term storage and retrieval.
+
+## Key Concepts: How the Notebook Works
+
+1. **What it Stores (`MemoryContent`):** Agents can store various types of information in their memory. This could be:
+ * Plain text notes (`text/plain`)
+ * Structured data like JSON (`application/json`)
+ * Even images (`image/*`)
+ Each piece of information is wrapped in a `MemoryContent` object, which includes the data itself, its type (`mime_type`), and optional descriptive `metadata`.
+
+ ```python
+ # From: memory/_base_memory.py (Simplified Concept)
+ from pydantic import BaseModel
+ from typing import Any, Dict, Union
+
+ # Represents one entry in the memory notebook
+ class MemoryContent(BaseModel):
+ content: Union[str, bytes, Dict[str, Any]] # The actual data
+ mime_type: str # What kind of data (e.g., "text/plain")
+ metadata: Dict[str, Any] | None = None # Extra info (optional)
+ ```
+ This standard format helps manage different kinds of memories.
+
+2. **Adding to Memory (`add`):** When an agent learns something important it wants to remember long-term (like the user's preferred style), it uses the `memory.add(content)` method. This is like writing a new entry in the notebook.
+
+3. **Querying Memory (`query`):** When an agent needs to recall information, it can use `memory.query(query_text)`. This is like searching the notebook for relevant entries. How the search works depends on the specific memory implementation (it could be a simple text match, or a sophisticated vector search in more advanced memories).
+
+4. **Updating Chat Context (`update_context`):** This is a crucial link! Before an agent talks to the LLM (using the `ChatCompletionClient` from [Chapter 5](05_chatcompletionclient.md)), it can call `memory.update_context(chat_context)`. This method:
+ * Looks at the current conversation (`chat_context`).
+ * Queries the long-term memory (`Memory`) for relevant information.
+ * Injects the retrieved memories *into* the `chat_context`, often as a `SystemMessage`.
+ This way, the LLM gets the benefit of the long-term memory *in addition* to the short-term conversation history, right before generating its response.
+
+5. **Different Memory Implementations:** Just like there are different `ChatCompletionContext` strategies, there can be different `Memory` implementations:
+ * `ListMemory`: A very simple memory that stores everything in a Python list (like a simple chronological notebook).
+ * *Future Possibilities*: More advanced implementations could use databases or vector stores for more efficient storage and retrieval of vast amounts of information.
+
+## Use Case Example: Remembering User Preferences with `ListMemory`
+
+Let's implement our user preference use case using the simple `ListMemory`.
+
+**Goal:**
+1. Create a `ListMemory`.
+2. Add a user preference ("formal style") to it.
+3. Start a *new* chat context.
+4. Use `update_context` to inject the preference into the new chat context.
+5. Show how the chat context looks *before* being sent to the LLM.
+
+**Step 1: Create the Memory**
+
+We'll use `ListMemory`, the simplest implementation provided by AutoGen Core.
+
+```python
+# File: create_list_memory.py
+from autogen_core.memory import ListMemory
+
+# Create a simple list-based memory instance
+user_prefs_memory = ListMemory(name="user_preferences")
+
+print(f"Created memory: {user_prefs_memory.name}")
+print(f"Initial content: {user_prefs_memory.content}")
+# Output:
+# Created memory: user_preferences
+# Initial content: []
+```
+We have an empty memory notebook named "user_preferences".
+
+**Step 2: Add the Preference**
+
+Let's add the user's preference as a piece of text memory.
+
+```python
+# File: add_preference.py
+import asyncio
+from autogen_core.memory import MemoryContent
+# Assume user_prefs_memory exists from the previous step
+
+# Define the preference as MemoryContent
+preference = MemoryContent(
+ content="User prefers all communication to be written in a formal style.",
+ mime_type="text/plain", # It's just text
+ metadata={"source": "user_instruction_conversation_1"} # Optional info
+)
+
+async def add_to_memory():
+ # Add the content to our memory instance
+ await user_prefs_memory.add(preference)
+ print(f"Memory content after adding: {user_prefs_memory.content}")
+
+asyncio.run(add_to_memory())
+# Output (will show the MemoryContent object):
+# Memory content after adding: [MemoryContent(content='User prefers...', mime_type='text/plain', metadata={'source': '...'})]
+```
+We've successfully written the preference into our `ListMemory` notebook.
+
+**Step 3: Start a New Chat Context**
+
+Imagine time passes, and the user starts a new conversation asking for an email draft. We create a fresh `ChatCompletionContext`.
+
+```python
+# File: start_new_chat.py
+from autogen_core.model_context import UnboundedChatCompletionContext
+from autogen_core.models import UserMessage
+
+# Start a new, empty chat context for a new task
+new_chat_context = UnboundedChatCompletionContext()
+
+# Add the user's new request
+new_request = UserMessage(content="Draft an email to the team about the Q3 results.", source="User")
+# await new_chat_context.add_message(new_request) # In a real app, add the request
+
+print("Created a new, empty chat context.")
+# Output: Created a new, empty chat context.
+```
+This context currently *doesn't* know about the "formal style" preference stored in our long-term memory.
+
+**Step 4: Inject Memory into Chat Context**
+
+Before sending the `new_chat_context` to the LLM, we use `update_context` to bring in relevant long-term memories.
+
+```python
+# File: update_chat_with_memory.py
+import asyncio
+# Assume user_prefs_memory exists (with the preference added)
+# Assume new_chat_context exists (empty or with just the new request)
+# Assume new_request exists
+
+async def main():
+ # --- This is where Memory connects to Chat Context ---
+ print("Updating chat context with memory...")
+ update_result = await user_prefs_memory.update_context(new_chat_context)
+ print(f"Memories injected: {len(update_result.memories.results)}")
+
+ # Now let's add the actual user request for this task
+ await new_chat_context.add_message(new_request)
+
+ # See what messages are now in the context
+ messages_for_llm = await new_chat_context.get_messages()
+ print("\nMessages to be sent to LLM:")
+ for msg in messages_for_llm:
+ print(f"- [{msg.type}]: {msg.content}")
+
+asyncio.run(main())
+```
+
+**Expected Output:**
+```
+Updating chat context with memory...
+Memories injected: 1
+
+Messages to be sent to LLM:
+- [SystemMessage]:
+Relevant memory content (in chronological order):
+1. User prefers all communication to be written in a formal style.
+
+- [UserMessage]: Draft an email to the team about the Q3 results.
+```
+Look! The `ListMemory.update_context` method automatically queried the memory (in this simple case, it just takes *all* entries) and added a `SystemMessage` to the `new_chat_context`. This message explicitly tells the LLM about the stored preference *before* it sees the user's request to draft the email.
+
+**Step 5: (Conceptual) Sending to LLM**
+
+Now, if we were to send `messages_for_llm` to the `ChatCompletionClient` (Chapter 5):
+
+```python
+# Conceptual code - Requires a configured client
+# response = await llm_client.create(messages=messages_for_llm)
+```
+The LLM would receive both the instruction about the formal style preference (from Memory) and the request to draft the email. It's much more likely to follow the preference now!
+
+**Step 6: Direct Query (Optional)**
+
+We can also directly query the memory if needed, without involving a chat context.
+
+```python
+# File: query_memory.py
+import asyncio
+# Assume user_prefs_memory exists
+
+async def main():
+ # Query the memory (ListMemory returns all items regardless of query text)
+ query_result = await user_prefs_memory.query("style preference")
+ print("\nDirect query result:")
+ for item in query_result.results:
+ print(f"- Content: {item.content}, Type: {item.mime_type}")
+
+asyncio.run(main())
+# Output:
+# Direct query result:
+# - Content: User prefers all communication to be written in a formal style., Type: text/plain
+```
+This shows how an agent could specifically look things up in its notebook.
+
+## Under the Hood: How `ListMemory` Injects Context
+
+Let's trace the `update_context` call for `ListMemory`.
+
+**Conceptual Flow:**
+
+```mermaid
+sequenceDiagram
+ participant AgentLogic as Agent Logic
+ participant ListMem as ListMemory
+ participant InternalList as Memory's Internal List
+ participant ChatCtx as ChatCompletionContext
+
+ AgentLogic->>+ListMem: update_context(chat_context)
+ ListMem->>+InternalList: Get all stored MemoryContent items
+ InternalList-->>-ListMem: Return list of [pref_content]
+ alt Memory list is NOT empty
+ ListMem->>ListMem: Format memories into a single string (e.g., "1. pref_content")
+ ListMem->>ListMem: Create SystemMessage with formatted string
+ ListMem->>+ChatCtx: add_message(SystemMessage)
+ ChatCtx-->>-ListMem: Context updated
+ end
+ ListMem->>ListMem: Create UpdateContextResult(memories=[pref_content])
+ ListMem-->>-AgentLogic: Return UpdateContextResult
+```
+
+1. The agent calls `user_prefs_memory.update_context(new_chat_context)`.
+2. The `ListMemory` instance accesses its internal `_contents` list.
+3. It checks if the list is empty. If not:
+4. It iterates through the `MemoryContent` items in the list.
+5. It formats them into a numbered string (like "Relevant memory content...\n1. Item 1\n2. Item 2...").
+6. It creates a single `SystemMessage` containing this formatted string.
+7. It calls `new_chat_context.add_message()` to add this `SystemMessage` to the chat history that will be sent to the LLM.
+8. It returns an `UpdateContextResult` containing the list of memories it just processed.
+
+**Code Glimpse:**
+
+* **`Memory` Protocol (`memory/_base_memory.py`):** Defines the required methods for any memory implementation.
+
+ ```python
+ # From: memory/_base_memory.py (Simplified ABC)
+ from abc import ABC, abstractmethod
+ # ... other imports: MemoryContent, MemoryQueryResult, UpdateContextResult, ChatCompletionContext
+
+ class Memory(ABC):
+ component_type = "memory"
+
+ @abstractmethod
+ async def update_context(self, model_context: ChatCompletionContext) -> UpdateContextResult: ...
+
+ @abstractmethod
+ async def query(self, query: str | MemoryContent, ...) -> MemoryQueryResult: ...
+
+ @abstractmethod
+ async def add(self, content: MemoryContent, ...) -> None: ...
+
+ @abstractmethod
+ async def clear(self) -> None: ...
+
+ @abstractmethod
+ async def close(self) -> None: ...
+ ```
+ Any class wanting to act as Memory must provide these methods.
+
+* **`ListMemory` Implementation (`memory/_list_memory.py`):**
+
+ ```python
+ # From: memory/_list_memory.py (Simplified)
+ from typing import List
+ # ... other imports: Memory, MemoryContent, ..., SystemMessage, ChatCompletionContext
+
+ class ListMemory(Memory):
+ def __init__(self, ..., memory_contents: List[MemoryContent] | None = None):
+ # Stores memory items in a simple list
+ self._contents: List[MemoryContent] = memory_contents or []
+
+ async def add(self, content: MemoryContent, ...) -> None:
+ """Add new content to the internal list."""
+ self._contents.append(content)
+
+ async def query(self, query: str | MemoryContent = "", ...) -> MemoryQueryResult:
+ """Return all memories, ignoring the query."""
+ # Simple implementation: just return everything
+ return MemoryQueryResult(results=self._contents)
+
+ async def update_context(self, model_context: ChatCompletionContext) -> UpdateContextResult:
+ """Add all memories as a SystemMessage to the chat context."""
+ if not self._contents: # Do nothing if memory is empty
+ return UpdateContextResult(memories=MemoryQueryResult(results=[]))
+
+ # Format all memories into a numbered list string
+ memory_strings = [f"{i}. {str(mem.content)}" for i, mem in enumerate(self._contents, 1)]
+ memory_context_str = "Relevant memory content...\n" + "\n".join(memory_strings) + "\n"
+
+ # Add this string as a SystemMessage to the provided chat context
+ await model_context.add_message(SystemMessage(content=memory_context_str))
+
+ # Return info about which memories were added
+ return UpdateContextResult(memories=MemoryQueryResult(results=self._contents))
+
+ # ... clear(), close(), config methods ...
+ ```
+ This shows the straightforward logic of `ListMemory`: store in a list, retrieve the whole list, and inject the whole list as a single system message into the chat context. More complex memories might use smarter retrieval (e.g., based on the `query` in `query()` or the last message in `update_context`) and inject memories differently.
+
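+To make that last point concrete, here is a sketch of one "smarter" direction: a hypothetical subclass that narrows `query` results by simple keyword matching instead of returning everything. The class is invented for illustration and is not part of AutoGen Core.
+
+```python
+# File: keyword_memory_sketch.py (hypothetical, not part of AutoGen Core)
+from autogen_core.memory import ListMemory, MemoryContent, MemoryQueryResult
+
+class KeywordListMemory(ListMemory):
+    async def query(self, query: str | MemoryContent = "", **kwargs) -> MemoryQueryResult:
+        """Return only entries whose content mentions the query text."""
+        text = query.content if isinstance(query, MemoryContent) else str(query)
+        hits = [m for m in self._contents
+                if str(text).lower() in str(m.content).lower()]
+        return MemoryQueryResult(results=hits)
+```
+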
+## Next Steps
+
+You've learned about `Memory`, AutoGen Core's mechanism for giving agents long-term recall beyond the immediate conversation (`ChatCompletionContext`). We saw how `MemoryContent` holds information, `add` stores it, `query` retrieves it, and `update_context` injects relevant memories into the LLM's working context. We explored the simple `ListMemory` as a basic example.
+
+Memory systems are crucial for agents that learn, adapt, or need to maintain state across interactions.
+
+This concludes our deep dive into the core abstractions of AutoGen Core! We've covered Agents, Messaging, Runtime, Tools, LLM Clients, Chat Context, and now Memory. There's one final concept that ties many of these together from a configuration perspective:
+
+* [Chapter 8: Component](08_component.md): Understand the general `Component` model in AutoGen Core, how it allows pieces like `Memory`, `ChatCompletionContext`, and `ChatCompletionClient` to be configured and managed consistently.
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/AutoGen Core/08_component.md b/output/AutoGen Core/08_component.md
new file mode 100644
index 0000000..65d019a
--- /dev/null
+++ b/output/AutoGen Core/08_component.md
@@ -0,0 +1,359 @@
+# Chapter 8: Component - The Standardized Building Blocks
+
+Welcome to Chapter 8! In our journey so far, we've met several key players in AutoGen Core:
+* [Agents](01_agent.md): The workers.
+* [Messaging System](02_messaging_system__topic___subscription_.md): How they communicate.
+* [AgentRuntime](03_agentruntime.md): The manager.
+* [Tools](04_tool.md): Their special skills.
+* [ChatCompletionClient](05_chatcompletionclient.md): How they talk to LLMs.
+* [ChatCompletionContext](06_chatcompletioncontext.md): How they remember recent chat history.
+* [Memory](07_memory.md): How they remember things long-term.
+
+Now, imagine you've built a fantastic agent system using these parts. You've configured a specific `ChatCompletionClient` to use OpenAI's `gpt-4o` model, and you've set up a `ListMemory` (from Chapter 7) to store user preferences. How do you save this exact setup so you can easily recreate it later, or share it with a friend? And what if you later want to swap out the `gpt-4o` client for a different one, like Anthropic's Claude, without rewriting your agent's core logic?
+
+This is where the **`Component`** concept comes in. It provides a standard way to define, configure, save, and load these reusable building blocks.
+
+## Motivation: Making Setups Portable and Swappable
+
+Think of the parts we've used so far, such as `ChatCompletionClient`, `Memory`, and `Tool`, as specialized **Lego bricks**. Each brick has a specific function (connecting to an LLM, remembering things, performing an action).
+
+Wouldn't it be great if:
+1. Each Lego brick had a standard way to describe its properties (like "Red 2x4 Brick")?
+2. You could easily save the description of all the bricks used in your creation (your agent system)?
+3. Someone else could take that description and automatically rebuild your exact creation?
+4. You could easily swap a "Red 2x4 Brick" for a "Blue 2x4 Brick" without having to rebuild everything around it?
+
+The `Component` abstraction in AutoGen Core provides exactly this! It makes your building blocks **configurable**, **savable**, **loadable**, and **swappable**.
+
+## Key Concepts: Understanding Components
+
+Let's break down what makes the Component system work:
+
+1. **Component:** A class (like `ListMemory` or `OpenAIChatCompletionClient`) that is designed to be a standard, reusable building block. It performs a specific role within the AutoGen ecosystem. Many core classes inherit from `Component` or related base classes.
+
+2. **Configuration (`Config`):** Every Component has specific settings. For example, an `OpenAIChatCompletionClient` needs an API key and a model name. A `ListMemory` might have a name. These settings are defined in a standard way, usually using a Pydantic `BaseModel` specific to that component type. This `Config` acts like the "specification sheet" for the component instance.
+
+3. **Saving Settings (`_to_config` method):** A Component instance knows how to generate its *current* configuration. It has an internal method, `_to_config()`, that returns a `Config` object representing its settings. This is like asking a configured Lego brick, "What color and size are you?"
+
+4. **Loading Settings (`_from_config` class method):** A Component *class* knows how to create a *new* instance of itself from a given configuration. It has a class method, `_from_config(config)`, that takes a `Config` object and builds a new, configured component instance. This is like having instructions: "Build a brick with this color and size."
+
+5. **`ComponentModel` (The Box):** This is the standard package format used to save and load components. It's like the label and instructions on the Lego box. A `ComponentModel` contains:
+ * `provider`: A string telling AutoGen *which* Python class to use (e.g., `"autogen_core.memory.ListMemory"`).
+ * `config`: A dictionary holding the specific settings for this instance (the output of `_to_config()`).
+ * `component_type`: The general role of the component (e.g., `"memory"`, `"model"`, `"tool"`).
+ * Other metadata like `version`, `description`, `label`.
+
+ ```python
+ # From: _component_config.py (Conceptual Structure)
+ from pydantic import BaseModel
+ from typing import Dict, Any
+
+ class ComponentModel(BaseModel):
+ provider: str # Path to the class (e.g., "autogen_core.memory.ListMemory")
+ config: Dict[str, Any] # The specific settings for this instance
+ component_type: str | None = None # Role (e.g., "memory")
+ # ... other fields like version, description, label ...
+ ```
+ This `ComponentModel` is what you typically save to a file (often as JSON or YAML).
+
+## Use Case Example: Saving and Loading `ListMemory`
+
+Let's see how this works with the `ListMemory` we used in [Chapter 7: Memory](07_memory.md).
+
+**Goal:**
+1. Create a `ListMemory` instance.
+2. Save its configuration using the Component system (`dump_component`).
+3. Load that configuration to create a *new*, identical `ListMemory` instance (`load_component`).
+
+**Step 1: Create and Configure a `ListMemory`**
+
+First, let's make a memory component. `ListMemory` is already designed as a Component.
+
+```python
+# File: create_memory_component.py
+import asyncio
+from autogen_core.memory import ListMemory, MemoryContent
+
+# Create an instance of ListMemory
+my_memory = ListMemory(name="user_prefs_v1")
+
+# Add some content (from Chapter 7 example)
+async def add_content():
+ pref = MemoryContent(content="Use formal style", mime_type="text/plain")
+ await my_memory.add(pref)
+ print(f"Created memory '{my_memory.name}' with content: {my_memory.content}")
+
+asyncio.run(add_content())
+# Output: Created memory 'user_prefs_v1' with content: [MemoryContent(content='Use formal style', mime_type='text/plain', metadata=None)]
+```
+We have our configured `my_memory` instance.
+
+**Step 2: Save the Configuration (`dump_component`)**
+
+Now, let's ask this component instance to describe itself by creating a `ComponentModel`.
+
+```python
+# File: save_memory_config.py
+# Assume 'my_memory' exists from the previous step
+
+# Dump the component's configuration into a ComponentModel
+memory_model = my_memory.dump_component()
+
+# Let's print it (converting to dict for readability)
+print("Saved ComponentModel:")
+print(memory_model.model_dump_json(indent=2))
+```
+
+**Expected Output:**
+```json
+Saved ComponentModel:
+{
+ "provider": "autogen_core.memory.ListMemory",
+ "component_type": "memory",
+ "version": 1,
+ "component_version": 1,
+ "description": "ListMemory stores memory content in a simple list.",
+ "label": "ListMemory",
+ "config": {
+ "name": "user_prefs_v1",
+ "memory_contents": [
+ {
+ "content": "Use formal style",
+ "mime_type": "text/plain",
+ "metadata": null
+ }
+ ]
+ }
+}
+```
+Look at the output! `dump_component` created a `ComponentModel` that contains:
+* `provider`: Exactly which class to use (`autogen_core.memory.ListMemory`).
+* `config`: The specific settings, including the `name` and even the `memory_contents` we added!
+* `component_type`: Its role is `"memory"`.
+* Other useful info like description and version.
+
+You could save this JSON structure to a file (`my_memory_config.json`).
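+
+If you want to persist it, here is a minimal sketch (the file name is just an example; `model_validate_json` is the standard Pydantic v2 counterpart of the `model_dump_json` call we used above):
+
+```python
+# File: persist_memory_config.py (illustrative sketch)
+from autogen_core import ComponentModel
+
+# Assume 'memory_model' is the ComponentModel from the previous step.
+# Write it to disk as JSON...
+with open("my_memory_config.json", "w") as f:
+    f.write(memory_model.model_dump_json(indent=2))
+
+# ...and, later or in another script, read it back.
+with open("my_memory_config.json") as f:
+    memory_model = ComponentModel.model_validate_json(f.read())
+```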
+
+**Step 3: Load the Configuration (`load_component`)**
+
+Now, imagine you're starting a new script or sharing the config file. You can load this `ComponentModel` to recreate the memory instance.
+
+```python
+# File: load_memory_config.py
+from autogen_core import ComponentModel
+from autogen_core.memory import ListMemory # Need the class for type hint/loading
+
+# Assume 'memory_model' is the ComponentModel we just created
+# (or loaded from a file)
+
+print(f"Loading component from ComponentModel (Provider: {memory_model.provider})...")
+
+# Use the ComponentLoader mechanism (available on Component classes)
+# to load the model. We specify the expected type (ListMemory).
+loaded_memory: ListMemory = ListMemory.load_component(memory_model)
+
+print(f"Successfully loaded memory!")
+print(f"- Name: {loaded_memory.name}")
+print(f"- Content: {loaded_memory.content}")
+```
+
+**Expected Output:**
+```
+Loading component from ComponentModel (Provider: autogen_core.memory.ListMemory)...
+Successfully loaded memory!
+- Name: user_prefs_v1
+- Content: [MemoryContent(content='Use formal style', mime_type='text/plain', metadata=None)]
+```
+Success! `load_component` read the `ComponentModel`, found the right class (`ListMemory`), used its `_from_config` method with the saved `config` data, and created a brand new `loaded_memory` instance that is identical to our original `my_memory`.
+
+**Benefits Shown:**
+* **Reproducibility:** We saved the exact state (including content!) and loaded it perfectly.
+* **Configuration:** We could easily save this to a JSON/YAML file and manage it outside our Python code.
+* **Modularity (Conceptual):** If `ListMemory` and `VectorDBMemory` were both Components of type "memory", we could load either one from a configuration file just by changing the `provider` and `config` entries, without altering the agent code that *uses* the memory component (assuming the agent interacts via the standard `Memory` interface from Chapter 7). See the sketch below.
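+
+To make that last bullet concrete, here is a hedged sketch of provider-based swapping. It relies on `load_component` accepting a plain dictionary (as the conceptual implementation later in this chapter shows); `VectorDBMemory` is imagined, so we only actually load `ListMemory` here:
+
+```python
+# Hypothetical sketch: swap implementations by editing only the config.
+from autogen_core.memory import ListMemory
+
+config_dict = {
+    "provider": "autogen_core.memory.ListMemory",  # <- swap providers here
+    "component_type": "memory",
+    "config": {"name": "user_prefs_v1"},
+}
+
+# The agent code that *uses* the memory does not change.
+memory = ListMemory.load_component(config_dict)
+print(type(memory).__name__)  # ListMemory
+```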
+
+## Under the Hood: How Saving and Loading Work
+
+Let's peek behind the curtain.
+
+**Saving (`dump_component`) Flow:**
+
+```mermaid
+sequenceDiagram
+ participant User
+ participant MyMemory as my_memory (ListMemory instance)
+ participant ListMemConfig as ListMemoryConfig (Pydantic Model)
+ participant CompModel as ComponentModel
+
+ User->>+MyMemory: dump_component()
+ MyMemory->>MyMemory: Calls internal self._to_config()
+ MyMemory->>+ListMemConfig: Creates Config object (name="...", contents=[...])
+ ListMemConfig-->>-MyMemory: Returns Config object
+ MyMemory->>MyMemory: Gets provider string ("autogen_core.memory.ListMemory")
+ MyMemory->>MyMemory: Gets component_type ("memory"), version, etc.
+ MyMemory->>+CompModel: Creates ComponentModel(provider=..., config=config_dict, ...)
+ CompModel-->>-MyMemory: Returns ComponentModel instance
+ MyMemory-->>-User: Returns ComponentModel instance
+```
+
+1. You call `my_memory.dump_component()`.
+2. It calls its own `_to_config()` method. For `ListMemory`, this gathers the `name` and current `_contents`.
+3. `_to_config()` returns a `ListMemoryConfig` object (a Pydantic model) holding these values.
+4. `dump_component()` takes this `ListMemoryConfig` object, converts its data into a dictionary (`config` field).
+5. It figures out its own class path (`provider`) and other metadata (`component_type`, `version`, etc.).
+6. It packages all this into a `ComponentModel` object and returns it.
+
+**Loading (`load_component`) Flow:**
+
+```mermaid
+sequenceDiagram
+ participant User
+ participant Loader as ComponentLoader (e.g., ListMemory.load_component)
+ participant Importer as Python Import System
+ participant ListMemClass as ListMemory (Class definition)
+ participant ListMemConfig as ListMemoryConfig (Pydantic Model)
+ participant NewMemory as New ListMemory Instance
+
+ User->>+Loader: load_component(component_model)
+ Loader->>Loader: Reads provider ("autogen_core.memory.ListMemory") from model
+ Loader->>+Importer: Imports the class `autogen_core.memory.ListMemory`
+ Importer-->>-Loader: Returns ListMemory class object
+ Loader->>+ListMemClass: Checks if it's a valid Component class
+ Loader->>ListMemClass: Gets expected config schema (ListMemoryConfig)
+ Loader->>+ListMemConfig: Validates `config` dict from model against schema
+ ListMemConfig-->>-Loader: Returns validated ListMemoryConfig object
+ Loader->>+ListMemClass: Calls _from_config(validated_config)
+ ListMemClass->>+NewMemory: Creates new ListMemory instance using config
+ NewMemory-->>-ListMemClass: Returns new instance
+ ListMemClass-->>-Loader: Returns new instance
+ Loader-->>-User: Returns the new ListMemory instance
+```
+
+1. You call `ListMemory.load_component(memory_model)`.
+2. The loader reads the `provider` string from `memory_model`.
+3. It dynamically imports the class specified by `provider`.
+4. It verifies this class is a proper `Component` subclass.
+5. It finds the configuration schema defined by the class (e.g., `ListMemoryConfig`).
+6. It validates the `config` dictionary from `memory_model` using this schema.
+7. It calls the class's `_from_config()` method, passing the validated configuration object.
+8. `_from_config()` uses the configuration data to initialize and return a new instance of the class (e.g., a new `ListMemory` with the loaded name and content).
+9. The loader returns this newly created instance.
+
+**Code Glimpse:**
+
+The core logic lives in `_component_config.py`.
+
+* **`Component` Base Class:** Classes like `ListMemory` inherit from `Component`. This requires them to define `component_type`, `component_config_schema`, and implement `_to_config()` and `_from_config()`.
+
+ ```python
+ # From: _component_config.py (Simplified Concept)
+ from pydantic import BaseModel
+ from typing import Type, TypeVar, Generic, ClassVar
+ # ... other imports
+
+ ConfigT = TypeVar("ConfigT", bound=BaseModel)
+
+ class Component(Generic[ConfigT]): # Generic over its config type
+ # Required Class Variables for Concrete Components
+ component_type: ClassVar[str]
+ component_config_schema: Type[ConfigT]
+
+ # Required Instance Method for Saving
+ def _to_config(self) -> ConfigT:
+ raise NotImplementedError
+
+ # Required Class Method for Loading
+ @classmethod
+ def _from_config(cls, config: ConfigT) -> Self:
+ raise NotImplementedError
+
+ # dump_component and load_component are also part of the system
+ # (often inherited from base classes like ComponentBase)
+ def dump_component(self) -> ComponentModel: ...
+ @classmethod
+ def load_component(cls, model: ComponentModel | Dict[str, Any]) -> Self: ...
+ ```
+
+* **`ComponentModel`:** As shown before, a Pydantic model to hold the `provider`, `config`, `type`, etc.
+
+* **`dump_component` Implementation (Conceptual):**
+ ```python
+ # Inside ComponentBase or similar
+ def dump_component(self) -> ComponentModel:
+ # 1. Get the specific config from the instance
+ obj_config: BaseModel = self._to_config()
+ config_dict = obj_config.model_dump() # Convert to dictionary
+
+ # 2. Determine the provider string (class path)
+ provider_str = _type_to_provider_str(self.__class__)
+ # (Handle overrides like self.component_provider_override)
+
+ # 3. Get other metadata
+ comp_type = self.component_type
+ comp_version = self.component_version
+ # ... description, label ...
+
+ # 4. Create and return the ComponentModel
+ model = ComponentModel(
+ provider=provider_str,
+ config=config_dict,
+ component_type=comp_type,
+ version=comp_version,
+ # ... other metadata ...
+ )
+ return model
+ ```
+
+* **`load_component` Implementation (Conceptual):**
+ ```python
+ # Inside ComponentLoader or similar
+ @classmethod
+ def load_component(cls, model: ComponentModel | Dict[str, Any]) -> Self:
+ # 1. Ensure we have a ComponentModel object
+ if isinstance(model, dict):
+ loaded_model = ComponentModel(**model)
+ else:
+ loaded_model = model
+
+ # 2. Import the class based on the provider string
+ provider_str = loaded_model.provider
+ # ... (handle WELL_KNOWN_PROVIDERS mapping) ...
+ module_path, class_name = provider_str.rsplit(".", 1)
+ module = importlib.import_module(module_path)
+ component_class = getattr(module, class_name)
+
+ # 3. Validate the class and config
+ if not is_component_class(component_class): # Check it's a valid Component
+ raise TypeError(...)
+ schema = component_class.component_config_schema
+ validated_config = schema.model_validate(loaded_model.config)
+
+ # 4. Call the class's factory method to create instance
+ instance = component_class._from_config(validated_config)
+
+ # 5. Return the instance (after type checks)
+ return instance
+ ```
+
+This system provides a powerful and consistent way to manage the building blocks of your AutoGen applications.
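+
+And if you want to define your *own* component, the pattern mirrors what we saw above. Here is a hedged sketch, assuming the base classes behave as described (the `Greeter` class and its config are invented for illustration):
+
+```python
+# Hypothetical sketch of a custom component following the pattern above.
+from pydantic import BaseModel
+from autogen_core import Component, ComponentBase
+
+class GreeterConfig(BaseModel):
+    greeting: str
+
+class Greeter(ComponentBase[GreeterConfig], Component[GreeterConfig]):
+    component_type = "greeter"               # the component's role
+    component_config_schema = GreeterConfig  # schema used to validate configs
+
+    def __init__(self, greeting: str):
+        self.greeting = greeting
+
+    def _to_config(self) -> GreeterConfig:
+        return GreeterConfig(greeting=self.greeting)  # describe current settings
+
+    @classmethod
+    def _from_config(cls, config: GreeterConfig) -> "Greeter":
+        return cls(greeting=config.greeting)  # rebuild from saved settings
+
+# Round trip: dump the config, then recreate an identical instance.
+model = Greeter("Hello!").dump_component()
+clone = Greeter.load_component(model)
+print(clone.greeting)  # Hello!
+```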
+
+## Wrapping Up
+
+Congratulations! You've reached the end of our core concepts tour. You now understand the `Component` model: AutoGen Core's standard way to define configurable, savable, and loadable building blocks like `Memory`, `ChatCompletionClient`, `Tool`, and even aspects of `Agents` themselves.
+
+* **Components** are like standardized Lego bricks.
+* They use **`_to_config`** to describe their settings.
+* They use **`_from_config`** to be built from settings.
+* **`ComponentModel`** is the standard "box" storing the provider and config, enabling saving/loading (often via JSON/YAML).
+
+This promotes:
+* **Modularity:** Easily swap implementations (e.g., different LLM clients).
+* **Reproducibility:** Save and load exact agent system configurations.
+* **Configuration:** Manage settings in external files.
+
+With these eight core concepts (`Agent`, `Messaging`, `AgentRuntime`, `Tool`, `ChatCompletionClient`, `ChatCompletionContext`, `Memory`, and `Component`), you have a solid foundation for understanding and building powerful multi-agent applications with AutoGen Core!
+
+Happy building!
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/AutoGen Core/index.md b/output/AutoGen Core/index.md
new file mode 100644
index 0000000..17fee53
--- /dev/null
+++ b/output/AutoGen Core/index.md
@@ -0,0 +1,47 @@
+# Tutorial: AutoGen Core
+
+AutoGen Core helps you build applications with multiple **_Agents_** that can work together.
+Think of it like creating a team of specialized workers (*Agents*) who can communicate and use tools to solve problems.
+The **_AgentRuntime_** acts as the manager, handling messages and agent lifecycles.
+Agents communicate using a **_Messaging System_** (Topics and Subscriptions), can use **_Tools_** for specific tasks, interact with language models via a **_ChatCompletionClient_** while managing conversation history with **_ChatCompletionContext_**, and remember information using **_Memory_**.
+**_Components_** provide a standard way to define and configure these building blocks.
+
+
+**Source Repository:** [https://github.com/microsoft/autogen/tree/e45a15766746d95f8cfaaa705b0371267bec812e/python/packages/autogen-core/src/autogen_core](https://github.com/microsoft/autogen/tree/e45a15766746d95f8cfaaa705b0371267bec812e/python/packages/autogen-core/src/autogen_core)
+
+```mermaid
+flowchart TD
+ A0["0: Agent"]
+ A1["1: AgentRuntime"]
+ A2["2: Messaging System (Topic & Subscription)"]
+ A3["3: Component"]
+ A4["4: Tool"]
+ A5["5: ChatCompletionClient"]
+ A6["6: ChatCompletionContext"]
+ A7["7: Memory"]
+ A1 -- "Manages lifecycle" --> A0
+ A1 -- "Uses for message routing" --> A2
+ A0 -- "Uses LLM client" --> A5
+ A0 -- "Executes tools" --> A4
+ A0 -- "Accesses memory" --> A7
+ A5 -- "Gets history from" --> A6
+ A5 -- "Uses tool schema" --> A4
+ A7 -- "Updates LLM context" --> A6
+ A4 -- "Implemented as" --> A3
+```
+
+## Chapters
+
+1. [Agent](01_agent.md)
+2. [Messaging System (Topic & Subscription)](02_messaging_system__topic___subscription_.md)
+3. [AgentRuntime](03_agentruntime.md)
+4. [Tool](04_tool.md)
+5. [ChatCompletionClient](05_chatcompletionclient.md)
+6. [ChatCompletionContext](06_chatcompletioncontext.md)
+7. [Memory](07_memory.md)
+8. [Component](08_component.md)
+
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/Browser Use/01_agent.md b/output/Browser Use/01_agent.md
new file mode 100644
index 0000000..e2b5e02
--- /dev/null
+++ b/output/Browser Use/01_agent.md
@@ -0,0 +1,259 @@
+# Chapter 1: The Agent - Your Browser Assistant's Brain
+
+Welcome to the `Browser Use` tutorial! We're excited to help you learn how to automate web tasks using the power of Large Language Models (LLMs).
+
+Imagine you want to perform a simple task, like searching Google for "cute cat pictures" and clicking on the very first image result. For a human, this is easy! You open your browser, type in the search, look at the results, and click.
+
+But how do you tell a computer program to do this? It needs to understand the goal, look at the webpage like a human does, decide what to click or type next, and then actually perform those actions. This is where the **Agent** comes in.
+
+## What Problem Does the Agent Solve?
+
+The Agent is the core orchestrator, the "brain" or "project manager" of your browser automation task. It connects all the different pieces needed to achieve your goal. Without the Agent, you'd have a bunch of tools (like a browser controller and an LLM) but no central coordinator telling them what to do and when.
+
+The Agent solves the problem of turning a high-level goal (like "find cat pictures") into concrete actions on a webpage, using intelligence to adapt to what it "sees" in the browser.
+
+## Meet the Agent: Your Project Manager
+
+Think of the `Agent` like a project manager overseeing a complex task. It doesn't do *all* the work itself, but it coordinates specialists:
+
+1. **Receives the Task:** You give the Agent the overall goal (e.g., "Search Google for 'cute cat pictures' and click the first image result.").
+2. **Consults the Planner (LLM):** The Agent shows the current state of the webpage (using the [BrowserContext](03_browsercontext.md)) to a Large Language Model (LLM). It asks, "Here's the goal, and here's what the webpage looks like right now. What should be the very next step?" The LLM acts as a smart planner, suggesting actions like "type 'cute cat pictures' into the search bar" or "click the element with index 5". We'll learn more about how we instruct the LLM in the [System Prompt](02_system_prompt.md) chapter.
+3. **Manages History:** The Agent keeps track of everything that has happened so far: the actions taken, the results, and the state of the browser at each step. This "memory" is managed by the [Message Manager](06_message_manager.md) and helps the LLM make better decisions.
+4. **Instructs the Doer (Controller):** Once the LLM suggests an action (like "click element 5"), the Agent tells the [Action Controller & Registry](05_action_controller___registry.md) to actually perform that specific action within the browser.
+5. **Observes the Results (BrowserContext):** After the Controller acts, the Agent uses the [BrowserContext](03_browsercontext.md) again to see the new state of the webpage (e.g., the Google search results page).
+6. **Repeats:** The Agent repeats steps 2-5, continuously consulting the LLM, instructing the Controller, and observing the results, until the original task is complete or it reaches a stopping point.
+
+## Using the Agent: A Simple Example
+
+Let's see how you might use the Agent in Python code. Don't worry about understanding every detail yet; focus on the main idea. We're setting up the Agent with our task and the necessary components.
+
+```python
+# --- Simplified Example ---
+import asyncio
+
+# We need to import the necessary parts from the browser_use library
+from browser_use import Agent, Browser, BrowserContext, Controller, BrowserConfig, BrowserContextConfig
+# Assume 'my_llm' is your configured Large Language Model (e.g., from OpenAI, Anthropic)
+from my_llm_setup import my_llm # Placeholder for your specific LLM setup
+
+# 1. Define the task for the Agent
+my_task = "Go to google.com, search for 'cute cat pictures', and click the first image result."
+
+# 2. Basic browser configuration (we'll learn more later)
+browser_config = BrowserConfig() # Default settings
+context_config = BrowserContextConfig() # Default settings
+
+# 3. Initialize the components the Agent needs
+# The Browser manages the underlying browser application
+browser = Browser(config=browser_config)
+# The Controller knows *how* to perform actions like 'click' or 'type'
+controller = Controller()
+
+async def main():
+ # The BrowserContext represents a single browser tab/window environment
+ # It uses the Browser and its configuration
+ async with BrowserContext(browser=browser, config=context_config) as browser_context:
+
+ # 4. Create the Agent instance!
+ agent = Agent(
+ task=my_task,
+ llm=my_llm, # The "brain" - the Language Model
+ browser_context=browser_context, # The "eyes" - interacts with the browser tab
+ controller=controller # The "hands" - executes actions
+ # Many other settings can be configured here!
+ )
+
+ print(f"Agent created. Starting task: {my_task}")
+
+ # 5. Run the Agent! This starts the loop.
+ # It will keep taking steps until the task is done or it hits the limit.
+ history = await agent.run(max_steps=15) # Limit steps for safety
+
+ # 6. Check the result
+ if history.is_done() and history.is_successful():
+            print("✅ Agent finished the task successfully!")
+ print(f"Final message from agent: {history.final_result()}")
+ else:
+            print("⚠️ Agent stopped. Maybe max_steps reached or task wasn't completed successfully.")
+
+ # The 'async with' block automatically cleans up the browser_context
+ await browser.close() # Close the browser application
+
+# Run the asynchronous function
+asyncio.run(main())
+```
+
+**What happens when you run this?**
+
+1. An `Agent` object is created with your task, the LLM, the browser context, and the controller.
+2. Calling `agent.run(max_steps=15)` starts the main loop.
+3. The Agent gets the initial state of the browser (likely a blank page).
+4. It asks the LLM what to do. The LLM might say "Go to google.com".
+5. The Agent tells the Controller to execute the "go to URL" action.
+6. The browser navigates to Google.
+7. The Agent gets the new state (Google's homepage).
+8. It asks the LLM again. The LLM says "Type 'cute cat pictures' into the search bar".
+9. The Agent tells the Controller to type the text.
+10. This continues step-by-step: pressing Enter, seeing results, asking the LLM, clicking the image.
+11. Eventually, the LLM will hopefully tell the Agent the task is "done".
+12. `agent.run()` finishes and returns the `history` object containing details of what happened.
+
+## How it Works Under the Hood: The Agent Loop
+
+Let's visualize the process with a simple diagram:
+
+```mermaid
+sequenceDiagram
+ participant User
+ participant Agent
+ participant LLM
+ participant Controller
+ participant BC as BrowserContext
+
+ User->>Agent: Start task("Search Google for cats...")
+ Note over Agent: Agent Loop Starts
+ Agent->>BC: Get current state (e.g., blank page)
+ BC-->>Agent: Current Page State
+ Agent->>LLM: What's next? (Task + State + History)
+ LLM-->>Agent: Plan: [Action: Type 'cute cat pictures', Action: Press Enter]
+ Agent->>Controller: Execute: type_text(...)
+ Controller->>BC: Perform type action
+ Agent->>Controller: Execute: press_keys('Enter')
+ Controller->>BC: Perform press action
+ Agent->>BC: Get new state (search results page)
+ BC-->>Agent: New Page State
+ Agent->>LLM: What's next? (Task + New State + History)
+ LLM-->>Agent: Plan: [Action: click_element(index=5)]
+ Agent->>Controller: Execute: click_element(index=5)
+ Controller->>BC: Perform click action
+ Note over Agent: Loop continues until done...
+ LLM-->>Agent: Plan: [Action: done(success=True, text='Found cat picture!')]
+ Agent->>Controller: Execute: done(...)
+ Controller-->>Agent: ActionResult (is_done=True)
+ Note over Agent: Agent Loop Ends
+ Agent->>User: Return History (Task Complete)
+
+```
+
+The core of the `Agent` lives in the `agent/service.py` file. The `Agent` class manages the overall process.
+
+1. **Initialization (`__init__`)**: When you create an `Agent`, it sets up its internal state, stores the task, the LLM, the controller, and prepares the [Message Manager](06_message_manager.md) to keep track of the conversation history. It also figures out the best way to talk to the specific LLM you provided.
+
+ ```python
+ # --- File: agent/service.py (Simplified __init__) ---
+ class Agent:
+ def __init__(
+ self,
+ task: str,
+ llm: BaseChatModel,
+ browser_context: BrowserContext,
+ controller: Controller,
+ # ... other settings like use_vision, max_failures, etc.
+ **kwargs
+ ):
+ self.task = task
+ self.llm = llm
+ self.browser_context = browser_context
+ self.controller = controller
+ self.settings = AgentSettings(**kwargs) # Store various settings
+ self.state = AgentState() # Internal state (step count, failures, etc.)
+
+ # Setup message manager for history, using the task and system prompt
+ self._message_manager = MessageManager(
+ task=self.task,
+ system_message=self.settings.system_prompt_class(...).get_system_message(),
+ settings=MessageManagerSettings(...)
+ # ... more setup ...
+ )
+ # ... other initializations ...
+ logger.info("Agent initialized.")
+ ```
+
+2. **Running the Task (`run`)**: The `run` method orchestrates the main loop. It calls the `step` method repeatedly until the task is marked as done, an error occurs, or `max_steps` is reached.
+
+ ```python
+ # --- File: agent/service.py (Simplified run method) ---
+ class Agent:
+ # ... (init) ...
+ async def run(self, max_steps: int = 100) -> AgentHistoryList:
+ self._log_agent_run() # Log start event
+ try:
+ for step_num in range(max_steps):
+ if self.state.stopped or self.state.consecutive_failures >= self.settings.max_failures:
+ break # Stop conditions
+
+ # Wait if paused
+ while self.state.paused: await asyncio.sleep(0.2)
+
+ step_info = AgentStepInfo(step_number=step_num, max_steps=max_steps)
+ await self.step(step_info) # <<< Execute one step of the loop
+
+ if self.state.history.is_done():
+ await self.log_completion() # Log success/failure
+ break # Exit loop if agent signaled 'done'
+ else:
+ logger.info("Max steps reached.") # Ran out of steps
+
+ finally:
+ # ... (cleanup, telemetry, potentially save history/gif) ...
+ pass
+ return self.state.history # Return the recorded history
+ ```
+
+3. **Taking a Step (`step`)**: This is the heart of the loop. In each step, the Agent:
+ * Gets the current browser state (`browser_context.get_state()`).
+ * Adds this state to the history via the `_message_manager`.
+ * Asks the LLM for the next action (`get_next_action()`).
+ * Tells the `Controller` to execute the action(s) (`multi_act()`).
+ * Records the outcome in the history.
+ * Handles any errors that might occur.
+
+ ```python
+ # --- File: agent/service.py (Simplified step method) ---
+ class Agent:
+ # ... (init, run) ...
+ async def step(self, step_info: Optional[AgentStepInfo] = None) -> None:
+            logger.info(f"📍 Step {self.state.n_steps}")
+ state = None
+ model_output = None
+ result: list[ActionResult] = []
+
+ try:
+ # 1. Get current state from the browser
+ state = await self.browser_context.get_state() # Uses BrowserContext
+
+ # 2. Add state (+ previous result) to message history for LLM context
+ self._message_manager.add_state_message(state, self.state.last_result, ...)
+
+ # 3. Get LLM's decision on the next action(s)
+ input_messages = self._message_manager.get_messages()
+ model_output = await self.get_next_action(input_messages) # Calls the LLM
+
+ self.state.n_steps += 1 # Increment step counter
+
+ # 4. Execute the action(s) using the Controller
+ result = await self.multi_act(model_output.action) # Uses Controller
+ self.state.last_result = result # Store result for next step's context
+
+ # 5. Record step details (actions, results, state snapshot)
+ self._make_history_item(model_output, state, result, ...)
+
+ self.state.consecutive_failures = 0 # Reset failure count on success
+
+ except Exception as e:
+ # Handle errors, increment failure count, maybe retry later
+ result = await self._handle_step_error(e)
+ self.state.last_result = result
+ # ... (finally block for logging/telemetry) ...
+ ```
+
+## Conclusion
+
+You've now met the `Agent`, the central coordinator in `Browser Use`. You learned that it acts like a project manager, taking your high-level task, consulting an LLM for step-by-step planning, managing the history, and instructing a `Controller` to perform actions within a `BrowserContext`.
+
+The Agent's effectiveness heavily relies on how well we instruct the LLM planner. In the next chapter, we'll dive into exactly that: crafting the **System Prompt** to guide the LLM's behavior.
+
+[Next Chapter: System Prompt](02_system_prompt.md)
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/Browser Use/02_system_prompt.md b/output/Browser Use/02_system_prompt.md
new file mode 100644
index 0000000..84328f1
--- /dev/null
+++ b/output/Browser Use/02_system_prompt.md
@@ -0,0 +1,235 @@
+# Chapter 2: The System Prompt - Setting the Rules for Your AI Assistant
+
+In [Chapter 1: The Agent](01_agent.md), we met the `Agent`, our project manager for automating browser tasks. We saw it consults a Large Language Model (LLM), the "planner", to decide the next steps based on the current state of the webpage. But how does the Agent tell the LLM *how* it should think, behave, and respond? Just giving it the task isn't enough!
+
+Imagine hiring a new assistant. You wouldn't just say, "Organize my files!" You'd give them specific instructions: "Please sort the files alphabetically by client name, put them in the blue folders, and give me a summary list when you're done." Without these rules, the assistant might do something completely different!
+
+The **System Prompt** solves this exact problem for our LLM. It's the set of core instructions and rules we give the LLM at the very beginning, telling it exactly how to act as a browser automation assistant and, crucially, how to format its responses so the `Agent` can understand them.
+
+## What is the System Prompt? The AI's Rulebook
+
+Think of the System Prompt like the AI assistant's fundamental operating manual, its "Prime Directive," or the rules of a board game. It defines:
+
+1. **Persona:** "You are an AI agent designed to automate browser tasks."
+2. **Goal:** "Your goal is to accomplish the ultimate task..."
+3. **Input:** How to understand the information it receives about the webpage ([DOM Representation](04_dom_representation.md)).
+4. **Capabilities:** What actions it can take ([Action Controller & Registry](05_action_controller___registry.md)).
+5. **Limitations:** What it *shouldn't* do (e.g., hallucinate actions).
+6. **Response Format:** The *exact* structure (JSON format) its thoughts and planned actions must follow.
+
+Without this rulebook, the LLM might just chat casually, give vague suggestions, or produce output in a format the `Agent` code can't parse. The System Prompt ensures the LLM behaves like the specialized tool we need.
+
+## Why is the Response Format So Important?
+
+This is a critical point. The `Agent` code isn't a human reading the LLM's response. It's a program expecting data in a very specific structure. The System Prompt tells the LLM to *always* respond in a JSON format that looks something like this (simplified):
+
+```json
+{
+ "current_state": {
+ "evaluation_previous_goal": "Success - Found the search bar.",
+ "memory": "On google.com main page. Need to search for cats.",
+ "next_goal": "Type 'cute cat pictures' into the search bar."
+ },
+ "action": [
+ {
+ "input_text": {
+ "index": 5, // The index of the search bar element
+ "text": "cute cat pictures"
+ }
+ },
+ {
+ "press_keys": {
+ "keys": "Enter" // Press the Enter key
+ }
+ }
+ ]
+}
+```
+
+The `Agent` can easily read this JSON:
+* It understands the LLM's thoughts (`current_state`).
+* It sees the exact `action` list the LLM wants to perform.
+* It passes these actions (like `input_text` or `press_keys`) to the [Action Controller & Registry](05_action_controller___registry.md) to execute them in the browser.
+
+If the LLM responded with just "Okay, I'll type 'cute cat pictures' into the search bar and press Enter," the `Agent` wouldn't know *which* element index corresponds to the search bar or exactly which actions to call. The strict JSON format is essential for automation.
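+
+To see what that buys us, here is a tiny illustrative sketch of how a program can mechanically unpack such a reply (the parsing code below is ours, not the library's):
+
+```python
+# Illustrative only: extracting actions from a well-formed JSON reply.
+import json
+
+raw_reply = """
+{"current_state": {"evaluation_previous_goal": "Success",
+                   "memory": "On google.com",
+                   "next_goal": "Type the search query"},
+ "action": [{"input_text": {"index": 5, "text": "cute cat pictures"}},
+            {"press_keys": {"keys": "Enter"}}]}
+"""
+
+parsed = json.loads(raw_reply)
+print("Next goal:", parsed["current_state"]["next_goal"])
+for action in parsed["action"]:
+    name, params = next(iter(action.items()))  # each action dict has one key
+    print(f"-> would ask the Controller to run {name} with {params}")
+```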
+
+## A Peek Inside the Rulebook (`system_prompt.md`)
+
+The actual instructions live in a text file within the `Browser Use` library: `browser_use/agent/system_prompt.md`. It's quite detailed, but here's a tiny snippet focusing on the response format rule:
+
+```markdown
+# Response Rules
+1. RESPONSE FORMAT: You must ALWAYS respond with valid JSON in this exact format:
+{{"current_state": {{"evaluation_previous_goal": "...",
+"memory": "...",
+"next_goal": "..."}},
+"action":[{{"one_action_name": {{...}}}}, ...]}}
+
+2. ACTIONS: You can specify multiple actions in the list... Use maximum {{max_actions}} actions...
+```
+*(This is heavily simplified! The real file has many more rules about element interaction, error handling, task completion, etc.)*
+
+This file clearly defines the JSON structure (`current_state` and `action`) and other crucial behaviors required from the LLM. (The doubled braces `{{ }}` are escapes: when the template is run through Python's `.format()`, as we'll see below, they become literal `{ }` braces in the final prompt.)
+
+## How the Agent Uses the System Prompt
+
+The `Agent` uses a helper class called `SystemPrompt` (found in `agent/prompts.py`) to manage these rules. Here's the flow:
+
+1. **Loading:** When you create an `Agent`, it internally creates a `SystemPrompt` object. This object reads the rules from the `system_prompt.md` file.
+2. **Formatting:** The `SystemPrompt` object formats these rules into a special `SystemMessage` object that LLMs understand as foundational instructions.
+3. **Conversation Start:** This `SystemMessage` is given to the [Message Manager](06_message_manager.md), which keeps track of the conversation history with the LLM. The `SystemMessage` becomes the *very first message*, setting the context for all future interactions in that session.
+
+Think of it like starting a meeting: the first thing you do is state the agenda and rules (System Prompt), and then the discussion (LLM interaction) follows based on that foundation.
+
+Let's look at a simplified view of the `SystemPrompt` class loading the rules:
+
+```python
+# --- File: agent/prompts.py (Simplified) ---
+import importlib.resources # Helps find files within the installed library
+from langchain_core.messages import SystemMessage # Special message type for LLMs
+
+class SystemPrompt:
+ def __init__(self, action_description: str, max_actions_per_step: int = 10):
+ # We ignore these details for now
+ self.default_action_description = action_description
+ self.max_actions_per_step = max_actions_per_step
+ self._load_prompt_template() # <--- Loads the rules file
+
+ def _load_prompt_template(self) -> None:
+ """Load the prompt rules from the system_prompt.md file."""
+ try:
+ # Finds the 'system_prompt.md' file inside the browser_use package
+ filepath = importlib.resources.files('browser_use.agent').joinpath('system_prompt.md')
+ with filepath.open('r') as f:
+ self.prompt_template = f.read() # Read the text content
+ print("System Prompt template loaded successfully!")
+ except Exception as e:
+ print(f"Error loading system prompt: {e}")
+ self.prompt_template = "Error: Could not load prompt." # Fallback
+
+ def get_system_message(self) -> SystemMessage:
+ """Format the loaded rules into a message for the LLM."""
+ # Replace placeholders like {{max_actions}} with actual values
+ prompt = self.prompt_template.format(max_actions=self.max_actions_per_step)
+ # Wrap the final rules text in a SystemMessage object
+ return SystemMessage(content=prompt)
+
+# --- How it plugs into Agent creation (Conceptual) ---
+# from browser_use import Agent, SystemPrompt
+# from my_llm_setup import my_llm # Your LLM
+# ... other setup ...
+
+# When you create an Agent:
+# agent = Agent(
+# task="Find cat pictures",
+# llm=my_llm,
+# browser_context=...,
+# controller=...,
+# # The Agent's __init__ method does something like this internally:
+# # system_prompt_obj = SystemPrompt(action_description="...", max_actions_per_step=10)
+# # system_message_for_llm = system_prompt_obj.get_system_message()
+# # This system_message_for_llm is then passed to the Message Manager.
+# )
+```
+
+This code shows how the `SystemPrompt` class finds and reads the `system_prompt.md` file and prepares the instructions as a `SystemMessage` ready for the LLM conversation.
+
+## Under the Hood: Initialization and Conversation Flow
+
+Let's visualize how the System Prompt fits into the Agent's setup and interaction loop:
+
+```mermaid
+sequenceDiagram
+ participant User
+ participant Agent_Init as Agent Initialization
+ participant SP as SystemPrompt Class
+ participant MM as Message Manager
+ participant Agent_Run as Agent Run Loop
+ participant LLM
+
+ User->>Agent_Init: Create Agent(task, llm, ...)
+ Note over Agent_Init: Agent needs the rules!
+ Agent_Init->>SP: Create SystemPrompt(...)
+ SP->>SP: _load_prompt_template() reads system_prompt.md
+ SP-->>Agent_Init: SystemPrompt instance
+ Agent_Init->>SP: get_system_message()
+ SP-->>Agent_Init: system_message (The Formatted Rules)
+ Note over Agent_Init: Pass rules to conversation manager
+ Agent_Init->>MM: Initialize MessageManager(task, system_message)
+ MM->>MM: Store system_message as message #1
+ MM-->>Agent_Init: MessageManager instance ready
+ Agent_Init-->>User: Agent created and ready
+
+ User->>Agent_Run: agent.run() starts the task
+ Note over Agent_Run: Agent needs context for LLM
+ Agent_Run->>MM: get_messages()
+ MM-->>Agent_Run: [system_message, user_message(state), ...]
+ Note over Agent_Run: Send rules + current state to LLM
+ Agent_Run->>LLM: Ask for next action (Input includes rules)
+ LLM-->>Agent_Run: JSON response (LLM followed rules!)
+ Agent_Run->>MM: add_model_output(...)
+ Note over Agent_Run: Loop continues...
+```
+
+Internally, the `Agent`'s initialization code (`__init__` in `agent/service.py`) explicitly creates the `SystemPrompt` and passes its output to the `MessageManager`:
+
+```python
+# --- File: agent/service.py (Simplified Agent __init__) ---
+# ... other imports ...
+from browser_use.agent.prompts import SystemPrompt # Import the class
+from browser_use.agent.message_manager.service import MessageManager, MessageManagerSettings
+
+class Agent:
+ def __init__(
+ self,
+ task: str,
+ llm: BaseChatModel,
+ browser_context: BrowserContext,
+ controller: Controller,
+ system_prompt_class: Type[SystemPrompt] = SystemPrompt, # Allows customizing the prompt class
+ max_actions_per_step: int = 10,
+ # ... other parameters ...
+ **kwargs
+ ):
+ self.task = task
+ self.llm = llm
+ # ... store other components ...
+
+ # Get the list of available actions from the controller
+ self.available_actions = controller.registry.get_prompt_description()
+
+ # 1. Create the SystemPrompt instance using the provided class
+ system_prompt_instance = system_prompt_class(
+ action_description=self.available_actions,
+ max_actions_per_step=max_actions_per_step,
+ )
+
+ # 2. Get the formatted SystemMessage (the rules)
+ system_message = system_prompt_instance.get_system_message()
+
+ # 3. Initialize the Message Manager with the task and the rules
+ self._message_manager = MessageManager(
+ task=self.task,
+ system_message=system_message, # <--- Pass the rules here!
+ settings=MessageManagerSettings(...)
+ # ... other message manager setup ...
+ )
+ # ... rest of initialization ...
+ logger.info("Agent initialized with System Prompt.")
+```
+
+When the `Agent` runs its loop (`agent.run()` calls `agent.step()`), it asks the `MessageManager` for the current conversation history (`self._message_manager.get_messages()`). The `MessageManager` always ensures that the `SystemMessage` (containing the rules) is the very first item in that history list sent to the LLM.
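+
+A quick way to check this invariant yourself (a hedged sketch; `_message_manager` is the internal attribute shown earlier, so treat this as debugging code):
+
+```python
+# Sketch: the rulebook always leads the conversation sent to the LLM.
+from langchain_core.messages import SystemMessage
+
+messages = agent._message_manager.get_messages()
+assert isinstance(messages[0], SystemMessage)  # message #1 is the rules
+print(messages[0].content[:80])                # the start of the rulebook text
+```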
+
+## Conclusion
+
+The System Prompt is the essential rulebook that governs the LLM's behavior within the `Browser Use` framework. It tells the LLM how to interpret the browser state, what actions it can take, and most importantly, dictates the exact JSON format for its responses. This structured communication is key to enabling the `Agent` to reliably understand the LLM's plan and execute browser automation tasks.
+
+Without a clear System Prompt, the LLM would be like an untrained assistant: potentially intelligent, but unable to follow the specific procedures needed for the job.
+
+Now that we understand how the `Agent` gets its fundamental instructions, how does it actually perceive the webpage it's supposed to interact with? In the next chapter, we'll explore the component responsible for representing the browser's state: the [BrowserContext](03_browsercontext.md).
+
+[Next Chapter: BrowserContext](03_browsercontext.md)
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/Browser Use/03_browsercontext.md b/output/Browser Use/03_browsercontext.md
new file mode 100644
index 0000000..e642caa
--- /dev/null
+++ b/output/Browser Use/03_browsercontext.md
@@ -0,0 +1,295 @@
+# Chapter 3: BrowserContext - The Agent's Isolated Workspace
+
+In the [previous chapter](02_system_prompt.md), we learned how the `System Prompt` acts as the rulebook for the AI assistant (LLM) that guides our `Agent`. We know the Agent uses the LLM to decide *what* to do next based on the current situation in the browser.
+
+But *where* does the Agent actually "see" the webpage and perform its actions? How does it keep track of the current website address (URL), the page content, and things like cookies, all while staying focused on its specific task without getting mixed up with your other browsing?
+
+This is where the **BrowserContext** comes in.
+
+## What Problem Does BrowserContext Solve?
+
+Imagine you ask your `Agent` to log into a specific online shopping website and check your order status. You might already be logged into that same website in your regular browser window with your personal account.
+
+If the Agent just used your main browser window, it might:
+1. Get confused by your existing login.
+2. Accidentally use your personal cookies or saved passwords.
+3. Interfere with other tabs you have open.
+
+We need a way to give the Agent its *own*, clean, separate browsing environment for each task. It needs an isolated "workspace" where it can open websites, log in, click buttons, and manage its own cookies without affecting anything else.
+
+The `BrowserContext` solves this by representing a single, isolated browser session.
+
+## Meet the BrowserContext: Your Agent's Private Browser Window
+
+Think of a `BrowserContext` like opening a brand new **Incognito Window** or creating a **separate User Profile** in your web browser (like Chrome or Firefox).
+
+* **It's Isolated:** What happens in one `BrowserContext` doesn't affect others or your main browser session. It has its own cookies, its own history (for that session), and its own set of tabs.
+* **It Manages State:** It keeps track of everything important about the current web session the Agent is working on:
+ * The current URL.
+ * Which tabs are open within its "window".
+ * Cookies specific to that session.
+ * The structure and content of the current webpage (the DOM - Document Object Model, which we'll explore in the [next chapter](04_dom_representation.md)).
+* **It's the Agent's Viewport:** The `Agent` looks through the `BrowserContext` to "see" the current state of the webpage. When the Agent decides to perform an action (like clicking a button), it tells the [Action Controller](05_action_controller___registry.md) to perform it *within* that specific `BrowserContext`.
+
+Essentially, the `BrowserContext` is like a dedicated, clean desk or workspace given to the Agent for its specific job.
+
+## Using the BrowserContext
+
+Before we can have an isolated session (`BrowserContext`), we first need the main browser application itself. This is handled by the `Browser` class. Think of `Browser` as the entire Chrome or Firefox application installed on your computer, while `BrowserContext` is just one window or profile within that application.
+
+Here's a simplified example of how you might set up a `Browser` and then create a `BrowserContext` to navigate to a page:
+
+```python
+import asyncio
+# Import necessary classes
+from browser_use import Browser, BrowserConfig, BrowserContext, BrowserContextConfig
+
+async def main():
+ # 1. Configure the main browser application (optional, defaults are usually fine)
+ browser_config = BrowserConfig(headless=False) # Show the browser window
+
+ # 2. Create the main Browser instance
+ # This might launch a browser application in the background (or connect to one)
+ browser = Browser(config=browser_config)
+ print("Browser application instance created.")
+
+ # 3. Configure the specific session/window (optional)
+ context_config = BrowserContextConfig(
+ user_agent="MyCoolAgent/1.0", # Example: Set a custom user agent
+ cookies_file="my_session_cookies.json" # Example: Save/load cookies
+ )
+
+ # 4. Create the isolated BrowserContext (like opening an incognito window)
+ # We use 'async with' to ensure it cleans up automatically afterwards
+ async with browser.new_context(config=context_config) as browser_context:
+ print(f"BrowserContext created (ID: {browser_context.context_id}).")
+
+ # 5. Use the context to interact with the browser session
+ start_url = "https://example.com"
+ print(f"Navigating to: {start_url}")
+ await browser_context.navigate_to(start_url)
+
+ # 6. Get information *from* the context
+ current_state = await browser_context.get_state() # Get current page info
+ print(f"Current page title: {current_state.title}")
+ print(f"Current page URL: {current_state.url}")
+
+ # The Agent would use this 'browser_context' object to see the page
+ # and tell the Controller to perform actions within it.
+
+ print("BrowserContext closed automatically.")
+
+ # 7. Close the main browser application when done
+ await browser.close()
+ print("Browser application closed.")
+
+# Run the asynchronous code
+asyncio.run(main())
+```
+
+**What happens here?**
+
+1. We set up a `BrowserConfig` (telling it *not* to run headless so we can see the window).
+2. We create a `Browser` instance, which represents the overall browser program.
+3. We create a `BrowserContextConfig` to specify settings for our isolated session (like a custom name or where to save cookies).
+4. Crucially, `browser.new_context(...)` creates our isolated session. The `async with` block ensures this session is properly closed later.
+5. We use methods *on the `browser_context` object* like `navigate_to()` to control *this specific session*.
+6. We use `browser_context.get_state()` to get information about the current page within *this session*. The `Agent` heavily relies on this method.
+7. After the `async with` block finishes, the `browser_context` is closed (like closing the incognito window), and finally, we close the main `browser` application.
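+
+One more point worth stressing: contexts created from the same `Browser` really are isolated from each other. A hedged sketch using the same API (assuming `new_context()` works with its default config):
+
+```python
+# Sketch: two isolated sessions from one Browser instance.
+async def demo_isolation(browser):
+    async with browser.new_context() as ctx_a, browser.new_context() as ctx_b:
+        await ctx_a.navigate_to("https://example.com")
+        # ctx_b has its own tabs, cookies, and history; nothing leaks from ctx_a.
+        state_b = await ctx_b.get_state()
+        print(state_b.url)  # ctx_b is still on its own blank start page
+```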
+
+## How it Works Under the Hood
+
+When the `Agent` needs to understand the current situation to decide the next step, it asks the `BrowserContext` for the latest state using the `get_state()` method. What happens then?
+
+1. **Wait for Stability:** The `BrowserContext` first waits for the webpage to finish loading and for network activity to settle down (`_wait_for_page_and_frames_load`). This prevents the Agent from acting on an incomplete page.
+2. **Analyze the Page:** It then uses the [DOM Representation](04_dom_representation.md) service (`DomService`) to analyze the current HTML structure of the page. This service figures out which elements are visible, interactive (buttons, links, input fields), and where they are.
+3. **Capture Visuals:** It often takes a screenshot of the current view (`take_screenshot`). This can be helpful for advanced agents or debugging.
+4. **Gather Metadata:** It gets the current URL, page title, and information about any other tabs open *within this context*.
+5. **Package the State:** All this information (DOM structure, URL, title, screenshot, etc.) is bundled into a `BrowserState` object.
+6. **Return to Agent:** The `BrowserContext` returns this `BrowserState` object to the `Agent`. The Agent then uses this information (often sending it to the LLM) to plan its next action.
+
+Here's a simplified diagram of the `get_state()` process:
+
+```mermaid
+sequenceDiagram
+ participant Agent
+ participant BC as BrowserContext
+ participant PlaywrightPage as Underlying Browser Page
+ participant DomService as DOM Service
+
+ Agent->>BC: get_state()
+ Note over BC: Wait for page to be ready...
+ BC->>PlaywrightPage: Ensure page/network is stable
+ PlaywrightPage-->>BC: Page is ready
+ Note over BC: Analyze the page content...
+ BC->>DomService: Get simplified DOM structure + interactive elements
+ DomService-->>BC: DOMState (element tree, etc.)
+ Note over BC: Get visuals and metadata...
+ BC->>PlaywrightPage: Take screenshot()
+ PlaywrightPage-->>BC: Screenshot data
+ BC->>PlaywrightPage: Get URL, Title
+ PlaywrightPage-->>BC: URL, Title data
+ Note over BC: Combine everything...
+ BC->>BC: Create BrowserState object
+ BC-->>Agent: Return BrowserState
+```
+
+Let's look at some simplified code snippets from the library.
+
+The `BrowserContext` is initialized (`__init__` in `browser/context.py`) with its configuration and a reference to the main `Browser` instance that created it.
+
+```python
+# --- File: browser/context.py (Simplified __init__) ---
+import uuid
+# ... other imports ...
+if TYPE_CHECKING:
+ from browser_use.browser.browser import Browser # Link to the Browser class
+
+@dataclass
+class BrowserContextConfig: # Configuration settings
+ # ... various settings like user_agent, cookies_file, window_size ...
+ pass
+
+@dataclass
+class BrowserSession: # Holds the actual Playwright context
+ context: PlaywrightBrowserContext # The underlying Playwright object
+ cached_state: Optional[BrowserState] = None # Stores the last known state
+
+class BrowserContext:
+ def __init__(
+ self,
+ browser: 'Browser', # Reference to the main Browser instance
+ config: BrowserContextConfig = BrowserContextConfig(),
+ # ... other optional state ...
+ ):
+ self.context_id = str(uuid.uuid4()) # Unique ID for this session
+ self.config = config # Store the configuration
+ self.browser = browser # Store the reference to the parent Browser
+
+ # The actual Playwright session is created later, when needed
+ self.session: BrowserSession | None = None
+ logger.debug(f"BrowserContext object created (ID: {self.context_id}). Session not yet initialized.")
+
+ # The 'async with' statement calls __aenter__ which initializes the session
+ async def __aenter__(self):
+ await self._initialize_session() # Creates the actual browser window/tab
+ return self
+
+ async def _initialize_session(self):
+ # ... (complex setup code happens here) ...
+ # Gets the main Playwright browser from self.browser
+ playwright_browser = await self.browser.get_playwright_browser()
+ # Creates the isolated Playwright context (like the incognito window)
+ context = await self._create_context(playwright_browser)
+ # Creates the BrowserSession to hold the context and state
+ self.session = BrowserSession(context=context, cached_state=None)
+ logger.debug(f"BrowserContext session initialized (ID: {self.context_id}).")
+ # ... (sets up the initial page) ...
+ return self.session
+
+ # ... other methods like navigate_to, close, etc. ...
+```
+
+The `get_state` method orchestrates fetching the current information from the browser session.
+
+```python
+# --- File: browser/context.py (Simplified get_state and helpers) ---
+# ... other imports ...
+from browser_use.dom.service import DomService # Imports the DOM analyzer
+from browser_use.browser.views import BrowserState # Imports the state structure
+
+class BrowserContext:
+ # ... (init, aenter, etc.) ...
+
+ async def get_state(self) -> BrowserState:
+ """Get the current state of the browser session."""
+ logger.debug(f"Getting state for context {self.context_id}...")
+ # 1. Make sure the page is loaded and stable
+ await self._wait_for_page_and_frames_load()
+
+ # 2. Get the actual Playwright session object
+ session = await self.get_session()
+
+ # 3. Update the state (this does the heavy lifting)
+ session.cached_state = await self._update_state()
+ logger.debug(f"State update complete for {self.context_id}.")
+
+ # 4. Optionally save cookies if configured
+ if self.config.cookies_file:
+ asyncio.create_task(self.save_cookies())
+
+ return session.cached_state
+
+ async def _wait_for_page_and_frames_load(self, timeout_overwrite: float | None = None):
+ """Ensures page is fully loaded before continuing."""
+ # ... (complex logic to wait for network idle, minimum times) ...
+ page = await self.get_current_page()
+ await page.wait_for_load_state('load', timeout=5000) # Simplified wait
+ logger.debug("Page load/network stability checks passed.")
+ await asyncio.sleep(self.config.minimum_wait_page_load_time) # Ensure minimum wait
+
+ async def _update_state(self) -> BrowserState:
+ """Fetches all info and builds the BrowserState."""
+ session = await self.get_session()
+ page = await self.get_current_page() # Get the active Playwright page object
+
+ try:
+ # Use DomService to analyze the page content
+ dom_service = DomService(page)
+ # Get the simplified DOM tree and interactive elements map
+ content_info = await dom_service.get_clickable_elements(
+ highlight_elements=self.config.highlight_elements,
+ # ... other DOM options ...
+ )
+
+ # Take a screenshot
+ screenshot_b64 = await self.take_screenshot()
+
+ # Get URL, Title, Tabs, Scroll info etc.
+ url = page.url
+ title = await page.title()
+ tabs = await self.get_tabs_info()
+ pixels_above, pixels_below = await self.get_scroll_info(page)
+
+ # Create the BrowserState object
+ browser_state = BrowserState(
+ element_tree=content_info.element_tree,
+ selector_map=content_info.selector_map,
+ url=url,
+ title=title,
+ tabs=tabs,
+ screenshot=screenshot_b64,
+ pixels_above=pixels_above,
+ pixels_below=pixels_below,
+ )
+ return browser_state
+
+ except Exception as e:
+ logger.error(f'Failed to update state: {str(e)}')
+ # Maybe return old state or raise error
+ raise BrowserError("Failed to get browser state") from e
+
+ async def take_screenshot(self, full_page: bool = False) -> str:
+ """Takes a screenshot and returns base64 encoded string."""
+ page = await self.get_current_page()
+ screenshot_bytes = await page.screenshot(full_page=full_page, animations='disabled')
+ return base64.b64encode(screenshot_bytes).decode('utf-8')
+
+ # ... many other helper methods (_get_current_page, get_tabs_info, etc.) ...
+
+```
+This shows how `BrowserContext` acts as a manager for a specific browser session, using underlying tools (like Playwright and `DomService`) to gather the necessary information (`BrowserState`) that the `Agent` needs to operate.
+
+## Conclusion
+
+The `BrowserContext` is a fundamental concept in `Browser Use`. It provides the necessary **isolated environment** for the `Agent` to perform its tasks, much like an incognito window or a separate browser profile. It manages the session's state (URL, cookies, tabs, page content) and provides the `Agent` with a snapshot of the current situation via the `get_state()` method.
+
+Understanding the `BrowserContext` helps clarify *where* the Agent works. Now, how does the Agent actually understand the *content* of the webpage within that context? How is the complex structure of a webpage represented in a way the Agent (and the LLM) can understand?
+
+In the next chapter, we'll dive into exactly that: the [DOM Representation](04_dom_representation.md).
+
+[Next Chapter: DOM Representation](04_dom_representation.md)
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
\ No newline at end of file
diff --git a/output/Browser Use/04_dom_representation.md b/output/Browser Use/04_dom_representation.md
new file mode 100644
index 0000000..e83263c
--- /dev/null
+++ b/output/Browser Use/04_dom_representation.md
@@ -0,0 +1,316 @@
+# Chapter 4: DOM Representation - Mapping the Webpage
+
+In the [previous chapter](03_browsercontext.md), we learned about the `BrowserContext`, the Agent's private workspace for browsing. We saw that the Agent uses `browser_context.get_state()` to get a snapshot of the current webpage. But how does the Agent actually *understand* the content of that snapshot?
+
+Imagine you're looking at the Google homepage. You instantly recognize the logo, the search bar, and the buttons. But a computer program just sees a wall of code (HTML). How can our `Agent` figure out: "This rectangular box is the search bar I need to type into," or "This specific image link is the first result I should click"?
+
+This is the problem solved by **DOM Representation**.
+
+## What Problem Does DOM Representation Solve?
+
+Webpages are built using HTML (HyperText Markup Language), which describes the structure and content. Your browser reads this HTML and creates an internal, structured representation called the **Document Object Model (DOM)**. It's like the browser builds a detailed blueprint or an outline from the HTML instructions.
+
+However, this raw DOM blueprint is incredibly complex and contains lots of information irrelevant to our Agent's task. The Agent doesn't need to know about every single tiny visual detail; it needs a *simplified map* focused on what's important for interaction:
+
+1. **What elements are on the page?** (buttons, links, input fields, text)
+2. **Are they visible to a user?** (Hidden elements shouldn't be interacted with)
+3. **Are they interactive?** (Can you click it? Can you type in it?)
+4. **How can the Agent refer to them?** (We need a simple way to say "click *this* button")
+
+DOM Representation solves the problem of translating the complex, raw DOM blueprint into a simplified, structured map that highlights the interactive "landmarks" and pathways the Agent can use.
+
+## Meet `DomService`: The Map Maker
+
+The component responsible for creating this map is the `DomService`. Think of it as a cartographer specializing in webpages.
+
+When the `Agent` (via the `BrowserContext`) asks for the current state of the page, the `BrowserContext` employs the `DomService` to analyze the page's live DOM.
+
+Here's what the `DomService` does:
+
+1. **Examines the Live Page:** It looks at the current structure rendered in the browser tab, not just the initial HTML source code (because JavaScript can change the page after it loads).
+2. **Identifies Elements:** It finds all the meaningful elements like buttons, links, input fields, and text blocks.
+3. **Checks Properties:** For each element, it determines crucial properties:
+ * **Visibility:** Is it actually displayed on the screen?
+ * **Interactivity:** Is it something a user can click, type into, or otherwise interact with?
+ * **Position:** Where is it located (roughly)?
+4. **Assigns Interaction Indices:** This is key! For elements deemed interactive and visible, `DomService` assigns a unique number, called a `highlight_index` (like `[5]`, `[12]`, etc.). This gives the Agent and the LLM a simple, unambiguous way to refer to specific elements.
+5. **Builds a Structured Tree:** It organizes this information into a simplified tree structure (`element_tree`) that reflects the page layout but is much easier to process than the full DOM.
+6. **Creates an Index Map:** It generates a `selector_map`, which is like an index in a book, mapping each `highlight_index` directly to its corresponding element node in the tree.
+
+The final output is a `DOMState` object containing the simplified `element_tree` and the handy `selector_map`. This `DOMState` is then included in the `BrowserState` that `BrowserContext.get_state()` returns to the Agent.
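+
+Concretely, this map is what lets an instruction like "click element 5" be resolved back to a real element. A small hedged sketch, assuming the `BrowserState` fields from the last chapter and a `tag_name` field on the node:
+
+```python
+# Sketch: resolving the LLM's "[5]" back to a concrete element.
+async def describe_element(browser_context, index: int) -> None:
+    state = await browser_context.get_state()  # includes element_tree + selector_map
+    node = state.selector_map[index]           # the DOMElementNode behind "[index]"
+    print(f"[{index}] is a <{node.tag_name}> element")
+```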
+
+## The Output: `DOMState` - The Agent's Map
+
+The `DOMState` object produced by `DomService` has two main parts:
+
+1. **`element_tree`:** This is the root of our simplified map, represented as a `DOMElementNode` object (defined in `dom/views.py`). Each node in the tree can be either an element (`DOMElementNode`) or a piece of text (`DOMTextNode`). `DOMElementNode`s contain information like the tag name (`