# Chapter 1: Crew - Your AI Team Manager

Welcome to the world of CrewAI! We're excited to help you build teams of AI agents that can work together to accomplish complex tasks.

Imagine you have a big project, like planning a surprise birthday trip for a friend. Doing it all yourself – researching destinations, checking flight prices, finding hotels, planning activities – can be overwhelming. Wouldn't it be great if you had a team to help? Maybe one person researches cool spots, another finds the best travel deals, and you coordinate everything.

That's exactly what a `Crew` does in CrewAI! It acts like the **project manager** or even the **entire team** itself, bringing together specialized AI assistants ([Agents](02_agent.md)) and telling them what [Tasks](03_task.md) to do and in what order.

**What Problem Does `Crew` Solve?**

Single AI models are powerful, but complex goals often require multiple steps and different kinds of expertise. A `Crew` allows you to break down a big goal into smaller, manageable [Tasks](03_task.md) and assign each task to the best AI [Agent](02_agent.md) for the job. It then manages how these agents work together to achieve the overall objective.

## What is a Crew?

Think of a `Crew` as the central coordinator. It holds everything together:

1. **The Team ([Agents](02_agent.md)):** It knows which AI agents are part of the team. Each agent might have a specific role (like 'Travel Researcher' or 'Booking Specialist').
2. **The Plan ([Tasks](03_task.md)):** It holds the list of tasks that need to be completed to achieve the final goal (e.g., 'Research European cities', 'Find affordable flights', 'Book hotel').
3. **The Workflow ([Process](05_process.md)):** It defines *how* the team works. Should they complete tasks one after another (`sequential`)? Or should there be a manager agent delegating work (`hierarchical`)?
4. **Collaboration:** It orchestrates how agents share information and pass results from one task to the next.

## Let's Build a Simple Crew!

Let's try building a very basic `Crew` for our trip planning example. For now, we'll just set up the structure. We'll learn more about creating sophisticated [Agents](02_agent.md) and [Tasks](03_task.md) in the next chapters.

```python
# Import necessary classes (we'll learn about these soon!)
from crewai import Agent, Task, Crew, Process

# Define our agents (don't worry about the details for now)
# Agent 1: The Researcher
researcher = Agent(
    role='Travel Researcher',
    goal='Find interesting cities in Europe for a birthday trip',
    backstory='An expert travel researcher.',
    # verbose=True,  # Optional: Shows agent's thinking process
    allow_delegation=False  # This agent doesn't delegate work
    # llm=your_llm  # We'll cover LLMs later!
)

# Agent 2: The Planner
planner = Agent(
    role='Activity Planner',
    goal='Create a fun 3-day itinerary for the chosen city',
    backstory='An experienced activity planner.',
    # verbose=True,
    allow_delegation=False
    # llm=your_llm
)
```

**Explanation:**

* We import `Agent`, `Task`, `Crew`, and `Process` from the `crewai` library.
* We create two simple [Agents](02_agent.md). We give them a `role` and a `goal`. Think of these as job titles and descriptions for our AI assistants. (We'll dive deep into Agents in [Chapter 2](02_agent.md)).

Now, let's define the [Tasks](03_task.md) for these agents:

```python
# Define the tasks
task1 = Task(
    description='Identify the top 3 European cities suitable for a sunny birthday trip in May.',
    expected_output='A list of 3 cities with brief reasons.',
    agent=researcher  # Assign task1 to the researcher agent
)

task2 = Task(
    description='Based on the chosen city from task 1, create a 3-day activity plan.',
    expected_output='A detailed itinerary for 3 days.',
    agent=planner  # Assign task2 to the planner agent
)
```

**Explanation:**

* We create two [Tasks](03_task.md). Each task has a `description` (what to do) and an `expected_output` (what the result should look like).
* Crucially, we assign each task to an `agent`. `task1` goes to the `researcher`, and `task2` goes to the `planner`. (More on Tasks in [Chapter 3](03_task.md)).

Finally, let's assemble the `Crew`:

```python
# Create the Crew
trip_crew = Crew(
    agents=[researcher, planner],
    tasks=[task1, task2],
    process=Process.sequential  # Tasks will run one after another
    # verbose=True  # Optional: logs the crew's execution details
)

# Start the Crew's work!
result = trip_crew.kickoff()

print("\n\n########################")
print("## Here is the result")
print("########################\n")
print(result)
```

**Explanation:**

1. We create an instance of the `Crew` class.
2. We pass the list of `agents` we defined earlier.
3. We pass the list of `tasks`. The order in this list matters for the sequential process.
4. We set the `process` to `Process.sequential`. This means `task1` will be completed first by the `researcher`, and its output will *automatically* be available as context for `task2` when the `planner` starts working.
5. We call the `kickoff()` method. This is like saying "Okay team, start working!"
6. The `Crew` manages the execution, ensuring the `researcher` does `task1`, then the `planner` does `task2`.
7. The `result` will contain the final output from the *last* task (`task2` in this case); the sketch just below shows how to inspect it.
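
If you need more than the final text, the object returned by `kickoff()` (a `CrewOutput`) also carries the individual task results. Here is a small sketch; the `raw` and `tasks_output` attribute names reflect recent CrewAI versions, so double-check them against the version you have installed:

```python
result = trip_crew.kickoff()

print(result.raw)  # The raw text of the final task's output

# Each completed task also leaves behind its own output object
for task_output in result.tasks_output:
    print(task_output.raw)  # What the agent produced for that task
```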

**Expected Outcome (Conceptual):**

When you run this (assuming you have underlying AI models configured, which we'll cover in the [LLM chapter](06_llm.md)), the `Crew` will:

1. Ask the `researcher` agent to perform `task1`.
2. The `researcher` will (conceptually) think and produce a list like: "1. Barcelona (Sunny, vibrant) 2. Lisbon (Coastal, historic) 3. Rome (Iconic, warm)".
3. The `Crew` takes this output and gives it to the `planner` agent along with `task2`.
4. The `planner` agent uses the city list (and likely picks one, or you'd refine the task) and creates a 3-day itinerary.
5. The final `result` printed will be the 3-day itinerary generated by the `planner`.

## How Does `Crew.kickoff()` Work Inside?

You don't *need* to know the deep internals to use CrewAI, but understanding the basics helps! When you call `kickoff()`:

1. **Input Check:** It checks if you provided any starting inputs (we didn't in this simple example, but you could provide a starting topic or variable; see the sketch after this list).
2. **Agent & Task Setup:** It makes sure all agents and tasks are ready to go. It ensures agents have the necessary configurations ([LLMs](06_llm.md), [Tools](04_tool.md) - more on these later!).
3. **Process Execution:** It looks at the chosen `process` (e.g., `sequential`).
    * **Sequential:** It runs tasks one by one. The output of task `N` is added to the context for task `N+1`.
    * **Hierarchical (Advanced):** If you chose this process, the Crew would use a dedicated 'manager' agent to coordinate the other agents and decide who does what next. We'll stick to sequential for now.
4. **Task Execution Loop:**
    * It picks the next task based on the process.
    * It finds the assigned agent for that task.
    * It gives the agent the task description and any relevant context (like outputs from previous tasks).
    * The agent performs the task using its underlying AI model ([LLM](06_llm.md)).
    * The agent returns the result (output) of the task.
    * The Crew stores this output.
    * Repeat until all tasks are done.
5. **Final Output:** The `Crew` packages the output from the final task (and potentially outputs from all tasks) and returns it.
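
Here is a brief sketch of that first step, passing starting inputs to `kickoff()`. It assumes the `{placeholder}` interpolation supported by recent CrewAI versions, where values from the `inputs` dictionary are substituted into task descriptions:

```python
# Placeholders in the description are filled from the `inputs` dict at kickoff time
research_task = Task(
    description='Identify the top 3 cities in {region} for a trip in {month}.',
    expected_output='A list of 3 cities with brief reasons.',
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[research_task], process=Process.sequential)
result = crew.kickoff(inputs={'region': 'Europe', 'month': 'May'})
```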

Let's visualize the `sequential` process:

```mermaid
sequenceDiagram
    participant User
    participant MyCrew as Crew
    participant ResearcherAgent as Researcher
    participant PlannerAgent as Planner

    User->>MyCrew: kickoff()
    MyCrew->>ResearcherAgent: Execute Task 1 ("Find cities...")
    Note right of ResearcherAgent: Researcher thinks... generates city list.
    ResearcherAgent-->>MyCrew: Task 1 Output ("Barcelona, Lisbon, Rome...")
    MyCrew->>PlannerAgent: Execute Task 2 ("Create itinerary...")<br/>with Task 1 Output as context
    Note right of PlannerAgent: Planner thinks... uses city list, creates itinerary.
    PlannerAgent-->>MyCrew: Task 2 Output ("Day 1: ..., Day 2: ...")
    MyCrew-->>User: Final Result (Task 2 Output)
```

**Code Glimpse (`crew.py` simplified):**

The `Crew` class itself is defined in `crewai/crew.py`. It takes parameters like `agents`, `tasks`, and `process` when you create it.

```python
# Simplified view from crewai/crew.py
class Crew(BaseModel):
    tasks: List[Task] = Field(default_factory=list)
    agents: List[BaseAgent] = Field(default_factory=list)
    process: Process = Field(default=Process.sequential)
    # ... other configurations like memory, cache, etc.

    def kickoff(self, inputs: Optional[Dict[str, Any]] = None) -> CrewOutput:
        # ... setup steps ...

        # Decides which execution path based on the process
        if self.process == Process.sequential:
            result = self._run_sequential_process()
        elif self.process == Process.hierarchical:
            result = self._run_hierarchical_process()
        else:
            # Handle other processes or errors
            raise NotImplementedError(...)

        # ... cleanup and formatting steps ...
        return result  # Returns a CrewOutput object

    def _run_sequential_process(self) -> CrewOutput:
        # Simplified loop logic
        task_outputs = []
        for task in self.tasks:
            agent = task.agent  # Find the agent for this task
            context = self._get_context(task, task_outputs)  # Get outputs from previous tasks
            # Execute the task (sync or async)
            output = task.execute_sync(agent=agent, context=context)
            task_outputs.append(output)
            # ... logging/callbacks ...
        return self._create_crew_output(task_outputs)  # Package final result
```

This simplified view shows how the `Crew` holds the `agents` and `tasks`, and the `kickoff` method directs traffic based on the chosen `process`, eventually looping through tasks sequentially if `Process.sequential` is selected.

## Conclusion

You've learned about the most fundamental concept in CrewAI: the `Crew`! It's the manager that brings your AI agents together, gives them tasks, and defines how they collaborate to achieve a larger goal. We saw how to define agents and tasks (at a high level) and assemble them into a `Crew` using a `sequential` process.

But a Crew is nothing without its members! In the next chapter, we'll dive deep into the first core component: the [Agent](02_agent.md). What makes an agent tick? How do you define their roles, goals, and capabilities? Let's find out!

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
# Chapter 2: Agent - Your Specialized AI Worker

In [Chapter 1](01_crew.md), we learned about the `Crew` – the manager that organizes our AI team. But a manager needs a team to manage! That's where `Agent`s come in.

## Why Do We Need Agents?

Imagine our trip planning `Crew` again. The `Crew` knows the overall goal (plan a surprise trip), but it doesn't *do* the research or the planning itself. It needs specialists.

* One specialist could be excellent at researching travel destinations.
* Another could be fantastic at creating detailed itineraries.

In CrewAI, these specialists are called **`Agent`s**. Instead of having one super-smart AI try to juggle everything, we create multiple `Agent`s, each with its own focus and expertise. This makes complex tasks more manageable and often leads to better results.

**Problem Solved:** `Agent`s allow you to break down a large task into smaller pieces and assign each piece to an AI worker specifically designed for it.

## What is an Agent?

Think of an `Agent` as a **dedicated AI worker** on your `Crew`. Each `Agent` has a unique profile that defines who they are and what they do:

1. **`role`**: This is the Agent's job title. What function do they perform in the team? Examples: 'Travel Researcher', 'Marketing Analyst', 'Code Reviewer', 'Blog Post Writer'.
2. **`goal`**: This is the Agent's primary objective. What specific outcome are they trying to achieve within their role? Examples: 'Find the top 3 family-friendly European destinations', 'Analyze competitor website traffic', 'Identify bugs in Python code', 'Draft an engaging blog post about AI'.
3. **`backstory`**: This is the Agent's personality, skills, and history. It tells the AI *how* to behave and what expertise it possesses. It adds flavour and context. Examples: 'An expert travel agent with 20 years of experience in European travel.', 'A data-driven market analyst known for spotting emerging trends.', 'A meticulous senior software engineer obsessed with code quality.', 'A witty content creator known for simplifying complex topics.'
4. **`llm`** (Optional): This is the Agent's "brain" – the specific Large Language Model (like GPT-4, Gemini, etc.) it uses to think, communicate, and execute tasks. We'll cover this more in the [LLM chapter](06_llm.md). If not specified, it usually inherits the `Crew`'s default LLM.
5. **`tools`** (Optional): These are special capabilities the Agent can use, like searching the web, using a calculator, or reading files. Think of them as the Agent's equipment. We'll explore these in the [Tool chapter](04_tool.md).
6. **`allow_delegation`** (Optional, default `False`): Can this Agent ask other Agents in the `Crew` for help with a sub-task? If `True`, it enables collaboration.
7. **`verbose`** (Optional, default `False`): If `True`, the Agent will print out its thought process as it works, which is great for debugging and understanding what's happening.

An Agent takes the [Tasks](03_task.md) assigned to it by the `Crew` and uses its `role`, `goal`, `backstory`, `llm`, and `tools` to complete them.

## Let's Define an Agent!

Let's revisit the `researcher` Agent from Chapter 1 and look closely at how it's defined.

```python
# Make sure you have crewai installed
# pip install crewai

from crewai import Agent

# Define our researcher agent
researcher = Agent(
    role='Expert Travel Researcher',
    goal='Find the most exciting and sunny European cities for a birthday trip in late May.',
    backstory=(
        "You are a world-class travel researcher with deep knowledge of "
        "European destinations. You excel at finding hidden gems and understanding "
        "weather patterns. Your recommendations are always insightful and tailored."
    ),
    verbose=True,  # We want to see the agent's thinking process
    allow_delegation=False  # This agent focuses on its own research
    # tools=[...]  # We'll add tools later!
    # llm=your_llm  # We'll cover LLMs later!
)

# (You would typically define other agents, tasks, and a crew here)
# print(researcher)  # Just to see the object
```

**Explanation:**

* `from crewai import Agent`: We import the necessary `Agent` class.
* `role='Expert Travel Researcher'`: We clearly define the agent's job title. This tells the LLM its primary function.
* `goal='Find the most exciting...'`: We give it a specific, measurable objective. This guides its actions.
* `backstory='You are a world-class...'`: We provide context and personality. This influences the *style* and *quality* of its output. Notice the detailed description – this helps the LLM adopt the persona.
* `verbose=True`: We'll see detailed logs of this agent's thoughts and actions when it runs.
* `allow_delegation=False`: This researcher won't ask other agents for help; it will complete its task independently.

Running this code snippet creates an `Agent` object in Python. This object is now ready to be added to a [Crew](01_crew.md) and assigned [Tasks](03_task.md); a quick way to try it on its own is sketched below.

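To try this agent right away, a minimal sketch is to wrap it in a one-task `Crew` (this reuses the classes from [Chapter 1](01_crew.md) and still needs an LLM configured, which we cover in the [LLM chapter](06_llm.md)):

```python
from crewai import Task, Crew, Process

research_task = Task(
    description='Find 3 sunny European cities for a birthday trip in late May.',
    expected_output='A short list of 3 cities with one-line justifications.',
    agent=researcher,
)

solo_crew = Crew(agents=[researcher], tasks=[research_task], process=Process.sequential)
# result = solo_crew.kickoff()  # Uncomment once an LLM is configured
```
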
## How Agents Work "Under the Hood"

So, what happens when an `Agent` is given a task by the `Crew`?

1. **Receive Task & Context:** The `Agent` gets the task description (e.g., "Find 3 sunny cities") and potentially some context from previous tasks (e.g., "The user prefers coastal cities").
2. **Consult Profile:** It looks at its own `role`, `goal`, and `backstory`. This helps it frame *how* to tackle the task. Our 'Expert Travel Researcher' will approach this differently than a 'Budget Backpacker Blogger'.
3. **Think & Plan (Using LLM):** The `Agent` uses its assigned `llm` (its brain) to think. It breaks down the task, formulates a plan, and decides what information it needs. This often involves an internal "monologue" (which you can see if `verbose=True`).
4. **Use Tools (If Necessary):** If the plan requires external information or actions (like searching the web for current weather or calculating travel times), and the agent *has* the right [Tools](04_tool.md), it will use them.
5. **Delegate (If Allowed & Necessary):** If `allow_delegation=True` and the `Agent` decides a sub-part of the task is better handled by another specialist `Agent` in the `Crew`, it can ask the `Crew` to delegate that part.
6. **Generate Output (Using LLM):** Based on its thinking, tool results, and potentially delegated results, the `Agent` uses its `llm` again to formulate the final response or output for the task.
7. **Return Result:** The `Agent` passes its completed work back to the `Crew`.

Let's visualize this simplified flow:

```mermaid
sequenceDiagram
    participant C as Crew
    participant MyAgent as Agent (Researcher)
    participant LLM as Agent's Brain
    participant SearchTool as Tool

    C->>MyAgent: Execute Task ("Find sunny cities in May")
    MyAgent->>MyAgent: Consult profile (Role, Goal, Backstory)
    MyAgent->>LLM: Formulate plan & Ask: "Best way to find sunny cities?"
    LLM-->>MyAgent: Suggestion: "Search web for 'Europe weather May'"
    MyAgent->>SearchTool: Use Tool(query="Europe weather May sunny cities")
    SearchTool-->>MyAgent: Web search results (e.g., Lisbon, Seville, Malta)
    MyAgent->>LLM: Consolidate results & Ask: "Format these 3 cities nicely"
    LLM-->>MyAgent: Formatted list: "1. Lisbon..."
    MyAgent-->>C: Task Result ("Here are 3 sunny cities: Lisbon...")
```

**Diving into the Code (`agent.py`)**

The core logic for the `Agent` resides in the `crewai/agent.py` file.

The `Agent` class itself inherits from `BaseAgent` (`crewai/agents/agent_builder/base_agent.py`) and primarily stores the configuration you provide:

```python
# Simplified view from crewai/agent.py
from crewai.agents.agent_builder.base_agent import BaseAgent
# ... other imports

class Agent(BaseAgent):
    role: str = Field(description="Role of the agent")
    goal: str = Field(description="Objective of the agent")
    backstory: str = Field(description="Backstory of the agent")
    llm: Any = Field(default=None, description="LLM instance")
    tools: Optional[List[BaseTool]] = Field(default_factory=list)
    allow_delegation: bool = Field(default=False)
    verbose: bool = Field(default=False)
    # ... other fields like memory, max_iter, etc.

    def execute_task(
        self,
        task: Task,
        context: Optional[str] = None,
        tools: Optional[List[BaseTool]] = None,
    ) -> str:
        # ... (Steps 1 & 2: Prepare task prompt with context, memory, knowledge) ...
        task_prompt = task.prompt()  # Get base task description
        if context:
            task_prompt = f"{task_prompt}\nContext:\n{context}"
        # Add memory, knowledge, tool descriptions etc. to the prompt...

        # ... (Internal setup: Create AgentExecutor if needed) ...
        self.create_agent_executor(tools=tools or self.tools)

        # ... (Steps 3-7: Run the execution loop via AgentExecutor) ...
        result = self.agent_executor.invoke({
            "input": task_prompt,
            "tool_names": self._get_tool_names(self.agent_executor.tools),
            "tools": self._get_tool_descriptions(self.agent_executor.tools),
            # ... other inputs for the executor ...
        })["output"]  # Extract the final string output

        return result

    def create_agent_executor(self, tools: Optional[List[BaseTool]] = None) -> None:
        # Sets up the internal CrewAgentExecutor which handles the actual
        # interaction loop with the LLM and tools.
        # It uses the agent's profile (role, goal, backstory) to build the main prompt.
        pass

    # ... other helper methods ...
```

Key takeaways from the code:

* The `Agent` class mainly holds the configuration (`role`, `goal`, `backstory`, `llm`, `tools`, etc.).
* The `execute_task` method is called by the `Crew` when it's the agent's turn.
* It prepares a detailed prompt for the underlying LLM, incorporating the task, context, the agent's profile, and available tools.
* It uses an internal object called `agent_executor` (specifically `CrewAgentExecutor` from `crewai/agents/crew_agent_executor.py`) to manage the actual step-by-step thinking, tool use, and response generation loop with the LLM.

You don't need to understand the `agent_executor` in detail right now, just know that it's the engine that drives the agent's execution based on the profile and task you provide.

## Conclusion

You've now met the core members of your AI team: the `Agent`s! You learned that each `Agent` is a specialized worker defined by its `role`, `goal`, and `backstory`. They use an [LLM](06_llm.md) as their brain and can be equipped with [Tools](04_tool.md) to perform specific actions.

We saw how to define an agent in code and got a glimpse into how they process information and execute the work assigned by the [Crew](01_crew.md).

But defining an `Agent` is only half the story. What specific work should they *do*? How do we describe the individual steps needed to achieve the `Crew`'s overall objective? That's where the next concept comes in: the [Task](03_task.md). Let's dive into defining the actual work!

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
# Chapter 3: Task - Defining the Work

In [Chapter 1](01_crew.md), we met the `Crew` - our AI team manager. In [Chapter 2](02_agent.md), we met the `Agent`s - our specialized AI workers. Now, we need to tell these agents *exactly* what to do. How do we give them specific assignments?

That's where the `Task` comes in!

## Why Do We Need Tasks?

Imagine our trip planning `Crew` again. We have a 'Travel Researcher' [Agent](02_agent.md) and an 'Activity Planner' [Agent](02_agent.md). Just having them isn't enough. We need to give them clear instructions:

* Researcher: "Find some sunny cities in Europe for May."
* Planner: "Create a 3-day plan for the city the Researcher found."

These specific instructions are **`Task`s** in CrewAI. Instead of one vague goal, we break the project down into smaller, concrete steps.

**Problem Solved:** `Task` allows you to define individual, actionable assignments for your [Agent](02_agent.md)s. It turns a big goal into a manageable checklist.

## What is a Task?

Think of a `Task` as a **work order** or a **specific assignment** given to an [Agent](02_agent.md). It clearly defines what needs to be done and what the expected result should look like.

Here are the key ingredients of a `Task`:

1. **`description`**: This is the most important part! It's a clear and detailed explanation of *what* the [Agent](02_agent.md) needs to accomplish. The more specific, the better.
2. **`expected_output`**: This tells the [Agent](02_agent.md) what a successful result should look like. It sets a clear target. Examples: "A list of 3 cities with pros and cons.", "A bulleted list of activities.", "A paragraph summarizing the key findings."
3. **`agent`**: This specifies *which* [Agent](02_agent.md) in your [Crew](01_crew.md) is responsible for completing this task. Each task is typically assigned to the agent best suited for it.
4. **`context`** (Optional but Important!): Tasks don't usually happen in isolation. A task might need information or results from *previous* tasks. The `context` allows the output of one task to be automatically fed as input/background information to the next task in a sequence.
5. **`tools`** (Optional): You can specify a list of [Tools](04_tool.md) that the [Agent](02_agent.md) is *allowed* to use specifically for *this* task. This can be useful to restrict or grant specific capabilities for certain assignments.
6. **`async_execution`** (Optional, Advanced): You can set this to `True` if you want the task to potentially run at the same time as other asynchronous tasks. We'll stick to synchronous (one after another) for now.
7. **`output_json` / `output_pydantic`** (Optional, Advanced): If you need the task's final output in a structured format like JSON, you can specify a model here.
8. **`output_file`** (Optional, Advanced): You can have the task automatically save its output to a file.

A `Task` bundles the instructions (`description`, `expected_output`) and assigns them to the right worker (`agent`), potentially giving them background info (`context`) and specific equipment (`tools`).

## Let's Define a Task!

Let's look again at the tasks we created for our trip planning [Crew](01_crew.md) in [Chapter 1](01_crew.md).

```python
# Import necessary classes
from crewai import Task, Agent  # Assuming Agent class is defined as in Chapter 2

# Assume 'researcher' and 'planner' agents are already defined
# researcher = Agent(role='Travel Researcher', ...)
# planner = Agent(role='Activity Planner', ...)

# Define Task 1 for the Researcher
task1 = Task(
    description=(
        "Identify the top 3 European cities known for great sunny weather "
        "around late May. Focus on cities with vibrant culture and good food."
    ),
    expected_output=(
        "A numbered list of 3 cities, each with a brief (1-2 sentence) justification "
        "mentioning weather, culture, and food highlights."
    ),
    agent=researcher  # Assign this task to our researcher agent
)

# Define Task 2 for the Planner
task2 = Task(
    description=(
        "Using the list of cities provided by the researcher, select the best city "
        "and create a detailed 3-day itinerary. Include morning, afternoon, and "
        "evening activities, plus restaurant suggestions."
    ),
    expected_output=(
        "A markdown formatted 3-day itinerary for the chosen city. "
        "Include timings, activity descriptions, and 2-3 restaurant ideas."
    ),
    agent=planner  # Assign this task to our planner agent
    # context=[task1]  # Optionally explicitly define context (often handled automatically)
)

# (You would then add these tasks to a Crew)
# print(task1)
# print(task2)
```

**Explanation:**

* `from crewai import Task`: We import the `Task` class.
* `description=...`: We write a clear instruction for the agent. Notice how `task1` specifies the criteria (sunny, May, culture, food). `task2` explicitly mentions using the output from the previous task.
* `expected_output=...`: We define what success looks like. `task1` asks for a numbered list with justifications. `task2` asks for a formatted itinerary. This helps the AI agent structure its response.
* `agent=researcher` / `agent=planner`: We link each task directly to the [Agent](02_agent.md) responsible for doing the work.
* `context=[task1]` (Commented Out): We *could* explicitly tell `task2` that it depends on `task1`. However, when using a `sequential` [Process](05_process.md) in the [Crew](01_crew.md), this dependency is usually handled automatically! The output of `task1` will be passed to `task2` as context.

Running this code creates `Task` objects, ready to be managed by a [Crew](01_crew.md).

## Task Workflow and Context: Connecting the Dots

Tasks are rarely standalone. They often form a sequence, where the result of one task is needed for the next. This is where `context` comes in.

Imagine our `Crew` is set up with a `sequential` [Process](05_process.md) (like in Chapter 1):

1. The `Crew` runs `task1` using the `researcher` agent.
2. The `researcher` completes `task1` and produces an output (e.g., "1. Lisbon...", "2. Seville...", "3. Malta..."). This output is stored.
3. The `Crew` moves to `task2`. Because it's sequential, it automatically takes the output from `task1` and provides it as *context* to `task2`.
4. The `planner` agent receives `task2`'s description *and* the list of cities from `task1` as context.
5. The `planner` uses this context to complete `task2` (e.g., creates an itinerary for Lisbon).

This automatic passing of information makes building workflows much easier!

```mermaid
graph LR
    A["Task 1: Find Cities (Agent: Researcher)"] -->|Output: Lisbon, Seville, Malta| B[Context for Task 2]
    B --> C["Task 2: Create Itinerary (Agent: Planner)"]
    C -->|Output: Lisbon Itinerary...| D[Final Result]

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#ccf,stroke:#333,stroke-width:1px,stroke-dasharray: 5 5
    style D fill:#cfc,stroke:#333,stroke-width:2px
```

While the `sequential` process often handles context automatically, you *can* explicitly define dependencies using the `context` parameter in the `Task` definition if you need more control, especially with more complex workflows, as in the sketch below.
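
For example, here is a small sketch of a third task that explicitly depends on both earlier tasks (the `summary_task` itself is hypothetical, added just to illustrate the `context` parameter):

```python
summary_task = Task(
    description='Write a one-paragraph summary of the chosen city and its itinerary.',
    expected_output='A single short paragraph.',
    agent=planner,
    context=[task1, task2]  # This task receives the outputs of both earlier tasks
)
```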

## How Does a Task Execute "Under the Hood"?

When the [Crew](01_crew.md)'s `kickoff()` method runs a task, here's a simplified view of what happens:

1. **Selection:** The [Crew](01_crew.md) (based on its [Process](05_process.md)) picks the next `Task` to execute.
2. **Agent Assignment:** It identifies the `agent` assigned to this `Task`.
3. **Context Gathering:** It collects the output from any prerequisite tasks (like the previous task in a sequential process) to form the `context`.
4. **Execution Call:** The [Crew](01_crew.md) tells the assigned `Agent` to execute the `Task`, passing the `description`, `expected_output`, available `tools` (if any specified for the task), and the gathered `context`.
5. **Agent Work:** The [Agent](02_agent.md) uses its configuration ([LLM](06_llm.md), backstory, etc.) and the provided information (task details, context, tools) to perform the work.
6. **Result Return:** The [Agent](02_agent.md) generates the result and returns it as a `TaskOutput` object.
7. **Output Storage:** The [Crew](01_crew.md) receives this `TaskOutput` and stores it, making it available as potential context for future tasks.

Let's visualize the interaction:

```mermaid
sequenceDiagram
    participant C as Crew
    participant T1 as Task 1
    participant R_Agent as Researcher Agent
    participant T2 as Task 2
    participant P_Agent as Planner Agent

    C->>T1: Prepare to Execute
    Note right of T1: Task 1 selected
    C->>R_Agent: Execute Task(T1.description, T1.expected_output)
    R_Agent->>R_Agent: Use LLM, Profile, Tools...
    R_Agent-->>C: Return TaskOutput (Cities List)
    C->>C: Store TaskOutput from T1

    C->>T2: Prepare to Execute
    Note right of T2: Task 2 selected
    Note right of C: Get Context (Output from T1)
    C->>P_Agent: Execute Task(T2.description, T2.expected_output, context=T1_Output)
    P_Agent->>P_Agent: Use LLM, Profile, Tools, Context...
    P_Agent-->>C: Return TaskOutput (Itinerary)
    C->>C: Store TaskOutput from T2
```

**Diving into the Code (`task.py`)**

The `Task` class itself is defined in `crewai/task.py`. It's primarily a container for the information you provide:

```python
# Simplified view from crewai/task.py
from pydantic import BaseModel, Field
from typing import List, Optional, Type, Any
# Import Agent and Tool placeholders for the example
from crewai import BaseAgent, BaseTool

class TaskOutput(BaseModel):  # Simplified representation of the result
    description: str
    raw: str
    agent: str
    # ... other fields like pydantic, json_dict

class Task(BaseModel):
    # Core attributes
    description: str = Field(description="Description of the actual task.")
    expected_output: str = Field(description="Clear definition of expected output.")
    agent: Optional[BaseAgent] = Field(default=None, description="Agent responsible.")

    # Optional attributes
    context: Optional[List["Task"]] = Field(default=None, description="Context from other tasks.")
    tools: Optional[List[BaseTool]] = Field(default_factory=list, description="Task-specific tools.")
    async_execution: Optional[bool] = Field(default=False)
    output_json: Optional[Type[BaseModel]] = Field(default=None)
    output_pydantic: Optional[Type[BaseModel]] = Field(default=None)
    output_file: Optional[str] = Field(default=None)
    callback: Optional[Any] = Field(default=None)  # Function to call after execution

    # Internal state
    output: Optional[TaskOutput] = Field(default=None, description="Task output after execution")

    def execute_sync(
        self,
        agent: Optional[BaseAgent] = None,
        context: Optional[str] = None,
        tools: Optional[List[BaseTool]] = None,
    ) -> TaskOutput:
        # 1. Identify the agent to use (passed or self.agent)
        agent_to_execute = agent or self.agent
        if not agent_to_execute:
            raise Exception("No agent assigned to task.")

        # 2. Prepare tools (task tools override agent tools if provided)
        execution_tools = tools or self.tools or agent_to_execute.tools

        # 3. Call the agent's execute_task method
        # (The agent handles LLM calls, tool use, etc.)
        raw_result = agent_to_execute.execute_task(
            task=self,  # Pass self (the task object)
            context=context,
            tools=execution_tools,
        )

        # 4. Format the output
        # (Handles JSON/Pydantic conversion if requested)
        pydantic_output, json_output = self._export_output(raw_result)

        # 5. Create and return TaskOutput object
        task_output = TaskOutput(
            description=self.description,
            raw=raw_result,
            pydantic=pydantic_output,
            json_dict=json_output,
            agent=agent_to_execute.role,
            # ... other fields
        )
        self.output = task_output  # Store the output within the task object

        # 6. Execute callback if defined
        if self.callback:
            self.callback(task_output)

        # 7. Save to file if output_file is set
        if self.output_file:
            # ... logic to save file ...
            pass

        return task_output

    def prompt(self) -> str:
        # Combines description and expected output for the agent
        return f"{self.description}\n\nExpected Output:\n{self.expected_output}"

    # ... other methods like execute_async, _export_output, _save_file ...
```

Key takeaways from the code:

* The `Task` class holds the configuration (`description`, `expected_output`, `agent`, etc.).
* The `execute_sync` (and `execute_async`) method orchestrates the execution *by calling the assigned agent's `execute_task` method*. The task itself doesn't contain the AI logic; it delegates that to the agent.
* It takes the raw result from the agent and wraps it in a `TaskOutput` object, handling formatting (like JSON) and optional actions (callbacks, file saving); see the sketch after this list.
* The `prompt()` method shows how the core instructions are formatted before being potentially combined with context and tool descriptions by the agent.
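
Because each executed task stores its `TaskOutput` on the task object itself, you can inspect intermediate results after a run. A brief sketch, assuming the attributes shown in the simplified class above:

```python
result = trip_crew.kickoff()

print(task1.output.raw)    # The researcher's city list
print(task1.output.agent)  # The role of the agent that produced it
```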

## Advanced Task Features (A Quick Peek)

While we focused on the basics, `Task` has more capabilities:

* **Asynchronous Execution (`async_execution=True`):** Allows multiple tasks to run concurrently, potentially speeding up your Crew if tasks don't strictly depend on each other's immediate output.
* **Structured Outputs (`output_json`, `output_pydantic`):** Force the agent to return data in a specific Pydantic model or JSON structure, making it easier to use the output programmatically (see the sketch after this list).
* **File Output (`output_file='path/to/output.txt'`):** Automatically save the task's result to a specified file.
* **Conditional Tasks (`ConditionalTask`):** A special type of task (defined in `crewai.tasks.conditional_task`) that only runs if a specific condition (based on the previous task's output) is met. This allows for branching logic in your workflows.
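
As an example of structured outputs, here is a sketch using `output_pydantic` (the `CityList` model is hypothetical, invented for illustration):

```python
from typing import List
from pydantic import BaseModel

class CityList(BaseModel):  # Hypothetical output model
    cities: List[str]
    reasons: List[str]

structured_task = Task(
    description='Identify the top 3 European cities for a sunny trip in May.',
    expected_output='Three cities, each with one reason.',
    agent=researcher,
    output_pydantic=CityList  # Ask CrewAI to parse the result into this model
)

# After the crew runs, structured_task.output.pydantic should be a CityList instance
```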

## Conclusion

You've now learned about the `Task` – the fundamental unit of work in CrewAI. A `Task` defines *what* needs to be done (`description`), what the result should look like (`expected_output`), and *who* should do it (`agent`). Tasks are the building blocks of your Crew's plan, and their outputs often flow as `context` to subsequent tasks, creating powerful workflows.

We've seen how to define Agents and give them Tasks. But what if an agent needs a specific ability, like searching the internet, calculating something, or reading a specific document? How do we give our agents superpowers? That's where [Tools](04_tool.md) come in! Let's explore them in the next chapter.

**Next:** [Chapter 4: Tool - Equipping Your Agents](04_tool.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
# Chapter 4: Tool - Equipping Your Agents

In [Chapter 3: Task](03_task.md), we learned how to define specific assignments (`Task`s) for our AI `Agent`s. We told the 'Travel Researcher' agent to find sunny cities and the 'Activity Planner' agent to create an itinerary.

But wait... how does the 'Travel Researcher' actually *find* those cities? Can it browse the web? Can it look at weather data? By default, an [Agent](02_agent.md)'s "brain" ([LLM](06_llm.md)) is great at reasoning and generating text based on the information it already has, but it can't interact with the outside world on its own.

This is where `Tool`s come in! They are the **special equipment and abilities** we give our agents to make them more capable.

## Why Do We Need Tools?

Imagine you hire a brilliant researcher. They can think, analyze, and write reports. But if their task is "Find the best coffee shop near me right now," they need specific tools: maybe a map application, a business directory, or a review website. Without these tools, they can only guess or rely on outdated knowledge.

Similarly, our AI [Agent](02_agent.md)s need `Tool`s to perform actions beyond simple text generation.

* Want your agent to find current information? Give it a **web search tool**.
* Need it to perform calculations? Give it a **calculator tool**.
* Want it to read a specific document? Give it a **file reading tool**.
* Need it to ask another agent for help? Use the built-in **delegation tool** ([AgentTools](tools/agent_tools/agent_tools.py)).

**Problem Solved:** `Tool`s extend an [Agent](02_agent.md)'s capabilities beyond its built-in knowledge, allowing it to interact with external systems, perform specific computations, or access real-time information.

## What is a Tool?

Think of a `Tool` as a **function or capability** that an [Agent](02_agent.md) can choose to use while working on a [Task](03_task.md). Each `Tool` has a few key parts:

1. **`name`**: A short, unique name for the tool (e.g., `web_search`, `calculator`).
2. **`description`**: This is **very important**! It tells the [Agent](02_agent.md) *what the tool does* and *when it should be used*. The agent's [LLM](06_llm.md) reads this description to decide if the tool is appropriate for the current step of its task. A good description is crucial for the agent to use the tool correctly. Example: "Useful for searching the internet for current events or information."
3. **`args_schema`** (Optional): Defines the inputs the tool needs to work. For example, a `web_search` tool would likely need a `query` argument (the search term). This is often defined using Pydantic models.
4. **`_run` method**: This is the actual code that gets executed when the agent uses the tool. It takes the arguments defined in `args_schema` and performs the action (like calling a search API or performing a calculation).

Agents are given a list of `Tool`s they are allowed to use. When an agent is working on a task, its internal thought process might lead it to conclude that it needs a specific capability. It will then look through its available tools, read their descriptions, and if it finds a match, it will figure out the necessary arguments and execute the tool's `_run` method.

## Equipping an Agent with a Tool

CrewAI integrates with many existing toolkits, like `crewai_tools` (install separately: `pip install 'crewai[tools]'`). Let's give our 'Travel Researcher' agent a web search tool. We'll use `SerperDevTool` as an example, which uses the Serper.dev API for Google Search results.

*(Note: Using tools like this often requires API keys. You'll need to sign up for Serper.dev and set the `SERPER_API_KEY` environment variable for this specific example to run.)*

```python
# Make sure you have crewai and crewai_tools installed
# pip install crewai crewai_tools

import os
from crewai import Agent
from crewai_tools import SerperDevTool

# Set up your API key (replace with your actual key or environment variable setup)
# IMPORTANT: Do NOT hardcode keys in production code! Use environment variables.
# os.environ["SERPER_API_KEY"] = "YOUR_SERPER_API_KEY"

# 1. Instantiate the tool
# (It automatically gets a name and description)
search_tool = SerperDevTool()

# 2. Define the agent and provide the tool in the 'tools' list
researcher = Agent(
    role='Expert Travel Researcher',
    goal='Find the three most exciting and sunny European cities for a birthday trip in late May.',
    backstory=(
        "You are a world-class travel researcher with deep knowledge of "
        "European destinations. You excel at finding hidden gems and understanding "
        "weather patterns. Your recommendations are always insightful and tailored. "
        "You MUST use the web search tool to find the most up-to-date information."  # Added instruction
    ),
    verbose=True,
    allow_delegation=False,
    tools=[search_tool]  # <-- Give the agent access to the tool!
    # llm=your_llm  # Assumes an LLM is configured (see Chapter 6)
)

# (You would then create a Task and a Crew as shown in previous chapters)
# task1 = Task(description="Find 3 sunny European cities for May...", agent=researcher, ...)
# trip_crew = Crew(agents=[researcher], tasks=[task1], ...)
# result = trip_crew.kickoff()
# print(result)
```

**Explanation:**

1. We import `SerperDevTool` from `crewai_tools`.
2. We create an instance of the tool: `search_tool = SerperDevTool()`. This object now holds the `name` ("Search"), `description` ("A tool that can be used to search the internet..."), and the `_run` logic to call the Serper API.
3. When defining our `researcher` [Agent](02_agent.md), we pass a list containing `search_tool` to the `tools` parameter: `tools=[search_tool]`.
4. We also updated the agent's `backstory` to explicitly encourage using the tool, which can sometimes help guide the agent.

**Expected Outcome (Conceptual):**

When this `researcher` agent runs a [Task](03_task.md) like "Find 3 sunny European cities for May...", because `verbose=True`, you would see its thought process. It might look something like this (simplified):

```
Thought: I need to find sunny European cities for May. The best way to get current information is to search the web. I have a 'Search' tool available. I should use it.
Action: Search
Action Input: {"query": "best sunny European cities May weather culture food"}

[... Agent waits for the tool to run ...]

Observation: [Search results mentioning Lisbon, Seville, Malta, Athens, etc. with details]

Thought: Okay, the search results suggest Lisbon, Seville, and Malta are good options based on sun, culture, and food. I will summarize these findings as requested.
Final Answer: Here are the top 3 sunny European cities for May... 1. Lisbon... 2. Seville... 3. Malta...
```

The agent used the tool's `description` to know when to use it, formulated the necessary input (`query`), executed the tool, received the `Observation` (the tool's output), and then used that information to generate its `Final Answer`.

## How Tools Work "Under the Hood"

When an [Agent](02_agent.md) equipped with tools runs a [Task](03_task.md), a fascinating interaction happens between the Agent, its [LLM](06_llm.md) brain, and the Tools.

1. **Task Received:** The Agent gets the task description and any context.
2. **Initial Thought:** The Agent's [LLM](06_llm.md) thinks about the task and its profile (`role`, `goal`, `backstory`). It formulates an initial plan.
3. **Need for Capability:** The LLM might realize it needs information it doesn't have (e.g., "What's the weather like *right now*?") or needs to perform an action (e.g., "Calculate 5 factorial").
4. **Tool Selection:** The Agent provides its [LLM](06_llm.md) with the list of available `Tool`s, including their `name`s and crucially, their `description`s. The LLM checks if any tool description matches the capability it needs.
5. **Tool Invocation Decision:** If the LLM finds a suitable tool (e.g., it needs to search, and finds the `Search` tool whose description says "Useful for searching the internet"), it decides to use it. It outputs a special message indicating the tool name and the arguments (based on the tool's `args_schema`).
6. **Tool Execution:** The CrewAI framework intercepts this special message. It finds the corresponding `Tool` object and calls its `run()` method, passing the arguments the LLM provided.
7. **Action Performed:** The tool's `_run()` method executes its code (e.g., calls an external API, runs a calculation).
8. **Result Returned:** The tool's `_run()` method returns its result (e.g., the text of the search results, the calculated number).
9. **Observation Provided:** The CrewAI framework takes the tool's result and feeds it back to the Agent's [LLM](06_llm.md) as an "Observation".
10. **Continued Thought:** The LLM now has new information from the tool. It incorporates this observation into its thinking and continues working on the task, potentially deciding to use another tool or generate the final answer.

Let's visualize this flow for our researcher using the search tool:

```mermaid
sequenceDiagram
    participant A as Agent
    participant LLM as Agent's Brain
    participant ST as Search Tool

    A->>LLM: Task: "Find sunny cities..." Plan?
    LLM-->>A: Plan: Need current info. Search web for "sunny European cities May".
    A->>A: Check tools: Found 'Search' tool (description matches).
    A->>LLM: Format request for 'Search' tool. Query?
    LLM-->>A: Output: Use Tool 'Search' with args {"query": "sunny European cities May"}
    A->>ST: run(query="sunny European cities May")
    Note right of ST: ST._run() calls Serper API...
    ST-->>A: Return results: "Lisbon (Sunny...), Seville (Hot...), Malta (Warm...)"
    A->>LLM: Observation: Got results "Lisbon...", "Seville...", "Malta..."
    LLM-->>A: Thought: Use these results to formulate the final list.
    LLM-->>A: Final Answer: "Based on recent web search, the top cities are..."
```

**Diving into the Code (`tools/base_tool.py`)**

The foundation for all tools is the `BaseTool` class (found in `crewai/tools/base_tool.py`). When you use a pre-built tool or create your own, it typically inherits from this class.

```python
# Simplified view from crewai/tools/base_tool.py
from abc import ABC, abstractmethod
from typing import Type, Optional, Any
from pydantic import BaseModel, Field

class BaseTool(BaseModel, ABC):
    # Configuration for the tool
    name: str = Field(description="The unique name of the tool.")
    description: str = Field(description="What the tool does, how/when to use it.")
    args_schema: Optional[Type[BaseModel]] = Field(
        default=None, description="Pydantic schema for the tool's arguments."
    )
    # ... other options like caching ...

    # This method contains the actual logic
    @abstractmethod
    def _run(self, *args: Any, **kwargs: Any) -> Any:
        """The core implementation of the tool's action."""
        pass

    # This method is called by the agent execution framework
    def run(self, *args: Any, **kwargs: Any) -> Any:
        """Executes the tool's core logic."""
        # Could add logging, error handling, caching calls here
        print(f"----- Executing Tool: {self.name} -----")  # Example logging
        result = self._run(*args, **kwargs)
        print(f"----- Tool {self.name} Finished -----")
        return result

    # Helper method to generate a structured description for the LLM
    def _generate_description(self):
        # Creates a detailed description including name, args, and description
        # This is what the LLM sees to decide if it should use the tool
        pass

    # ... other helper methods ...

# You can create a simple tool using the 'Tool' class directly
# or inherit from BaseTool for more complex logic.
class SimpleTool(BaseTool):
    name: str = "MySimpleTool"
    description: str = "A very simple example tool."
    # No args_schema needed if it takes no arguments

    def _run(self) -> str:
        return "This simple tool was executed successfully!"
```

Key takeaways:

* `BaseTool` requires `name` and `description`.
* `args_schema` defines the expected input structure (using Pydantic).
* The actual logic lives inside the `_run` method.
* The `run` method is the entry point called by the framework.
* The framework (`crewai/tools/tool_usage.py` and `crewai/agents/executor.py`) handles the complex part: presenting tools to the LLM, parsing the LLM's decision to use a tool, calling `tool.run()`, and feeding the result back.

A special mention goes to `AgentTools` (`crewai/tools/agent_tools/agent_tools.py`), which provides tools like `Delegate work to coworker` and `Ask question to coworker`, enabling agents within a [Crew](01_crew.md) to collaborate.

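Tools can also be exercised directly, without any agent, which is handy when debugging a new tool. Using the `SimpleTool` sketch above:

```python
simple = SimpleTool()
print(simple.run())
# ----- Executing Tool: MySimpleTool -----
# ----- Tool MySimpleTool Finished -----
# This simple tool was executed successfully!
```
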
## Creating Your Own Simple Tool (Optional)

While CrewAI offers many pre-built tools, sometimes you need a custom one. Let's create a *very* basic calculator.

```python
from crewai import Agent
from crewai.tools import BaseTool
from pydantic import BaseModel, Field
from typing import Type
import math  # Exposed to eval's namespace below (largely unused given the character whitelist)

# 1. Define the input schema using Pydantic
class CalculatorInput(BaseModel):
    expression: str = Field(description="The mathematical expression to evaluate (e.g., '2 + 2 * 4').")

# 2. Create the Tool class, inheriting from BaseTool
class CalculatorTool(BaseTool):
    name: str = "Calculator"
    description: str = "Useful for evaluating simple mathematical expressions involving numbers, +, -, *, /, and parentheses."
    args_schema: Type[BaseModel] = CalculatorInput  # Link the input schema

    def _run(self, expression: str) -> str:
        """Evaluates the mathematical expression."""
        allowed_chars = "0123456789+-*/(). "
        if not all(c in allowed_chars for c in expression):
            return "Error: Expression contains invalid characters."

        try:
            # VERY IMPORTANT: eval() is dangerous with arbitrary user input.
            # In a real application, use a safer parsing library like 'numexpr' or build your own parser.
            # This is a simplified example ONLY.
            result = eval(expression, {"__builtins__": None}, {"math": math})  # Safer eval
            return f"The result of '{expression}' is {result}"
        except Exception as e:
            return f"Error evaluating expression '{expression}': {e}"

# 3. Instantiate and use it in an agent
calculator = CalculatorTool()

math_agent = Agent(
    role='Math Whiz',
    goal='Calculate the results of mathematical expressions accurately.',
    backstory='You are an expert mathematician agent.',
    tools=[calculator],  # Give the agent the calculator
    verbose=True
)

# Example Task for this agent:
# math_task = Task(description="What is the result of (5 + 3) * 6 / 2?", agent=math_agent)
```

**Explanation:**

1. We define `CalculatorInput` using Pydantic to specify that the tool needs an `expression` string. The `description` here helps the LLM understand what kind of string to provide.
2. We create `CalculatorTool`, inheriting from `BaseTool`. We set `name` and `description`, and link `args_schema` to our `CalculatorInput`.
3. The `_run` method takes the `expression` string, applies a basic character whitelist, and uses a restricted form of `eval`. **Again, `eval` is generally unsafe; prefer dedicated math parsing libraries in production.** It returns the result as a string.
4. We can now instantiate `CalculatorTool()` and add it to an agent's `tools` list.
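
If you want to avoid `eval` entirely, Python's standard `ast` module can evaluate plain arithmetic safely. Below is a minimal sketch of such an evaluator (the `safe_eval` helper is our own illustration, not part of CrewAI) that could replace the `eval` call inside `_run`:

```python
import ast
import operator

# Whitelist of AST operator nodes we are willing to evaluate
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression element")

    return _eval(ast.parse(expression, mode="eval"))

print(safe_eval("(5 + 3) * 6 / 2"))  # 24.0
```

Because only whitelisted operator nodes are dispatched, anything beyond plain arithmetic raises a `ValueError` instead of executing arbitrary code.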

## Conclusion

You've learned about `Tool`s – the essential equipment that gives your AI [Agent](02_agent.md)s superpowers! Tools allow agents to perform actions like searching the web, doing calculations, or interacting with other systems, making them vastly more useful than agents that can only generate text. We saw how to equip an agent with pre-built tools and even how to create a simple custom tool by defining its `name`, `description`, `args_schema`, and `_run` method. The `description` is key for the agent to know when and how to use its tools effectively.

Now that we have Agents equipped with Tools and assigned Tasks, how does the whole [Crew](01_crew.md) actually coordinate the work? Do agents work one after another? Is there a manager? That's determined by the `Process`. Let's explore that next!

**Next:** [Chapter 5: Process - Orchestrating the Workflow](05_process.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

297
docs/CrewAI/05_process.md
Normal file
@@ -0,0 +1,297 @@

# Chapter 5: Process - Orchestrating the Workflow

In [Chapter 4: Tool](04_tool.md), we learned how to give our [Agent](02_agent.md)s special abilities using `Tool`s, like searching the web. Now we have specialized agents, defined tasks, and equipped agents. But how do they actually *work together*? Does Agent 1 finish its work before Agent 2 starts? Or is there a manager overseeing everything?

This coordination is handled by the **`Process`**.

## Why Do We Need a Process?

Think back to our trip planning [Crew](01_crew.md). We have a 'Travel Researcher' agent and an 'Activity Planner' agent.

* **Scenario 1:** Maybe the Researcher needs to find the city *first*, and *then* the Planner creates the itinerary for that specific city. The work happens in a specific order.
* **Scenario 2:** Maybe we have a more complex project with many agents (Researcher, Planner, Booker, Budgeter). Perhaps we want a 'Project Manager' agent to receive the main goal, decide which agent needs to do what first, review their work, and then assign the next step.

The way the agents collaborate and the order in which [Task](03_task.md)s are executed is crucial for success. A well-defined `Process` ensures work flows smoothly and efficiently.

**Problem Solved:** `Process` defines the strategy or workflow the [Crew](01_crew.md) uses to execute its [Task](03_task.md)s. It dictates how [Agent](02_agent.md)s collaborate and how information moves between them.

## What is a Process?

Think of the `Process` as the **project management style** for your [Crew](01_crew.md). It determines the overall flow of work. CrewAI primarily supports two types of processes:

1. **`Process.sequential`**:
    * **Analogy:** Like following a recipe or a checklist.
    * **How it works:** Tasks are executed one after another, in the exact order you list them in the `Crew` definition. The output of the first task automatically becomes available as context for the second task, the output of the second for the third, and so on.
    * **Best for:** Simple, linear workflows where each step clearly follows the previous one.

2. **`Process.hierarchical`**:
    * **Analogy:** Like a traditional company structure with a manager.
    * **How it works:** You designate a "manager" [Agent](02_agent.md) (usually by providing a specific `manager_llm` or a custom `manager_agent` to the `Crew`). This manager receives the overall goal and the list of tasks. It then analyzes the tasks and decides which *worker* agent should perform which task, potentially breaking them down or reordering them. The manager delegates work, reviews results, and coordinates the team until the goal is achieved.
    * **Best for:** More complex projects where task order might change, delegation is needed, or a central coordinator can optimize the workflow.

Choosing the right `Process` is key to structuring how your agents interact.

## How to Use Process

You define the process when you create your `Crew`, using the `process` parameter.

### Sequential Process

This is the default and simplest process. We already used it in [Chapter 1](01_crew.md)!

```python
# Assuming 'researcher' and 'planner' agents are defined (from Chapter 2)
# Assuming 'task1' (find cities) and 'task2' (create itinerary) are defined (from Chapter 3)
# task1 assigned to researcher, task2 assigned to planner

from crewai import Crew, Process

# Define the crew with a sequential process
trip_crew = Crew(
    agents=[researcher, planner],
    tasks=[task1, task2],
    process=Process.sequential  # Explicitly setting the sequential process
    # verbose=2  # Optional verbosity
)

# Start the work
# result = trip_crew.kickoff()
# print(result)
```

**Explanation:**

* We import `Crew` and `Process`.
* When creating the `trip_crew`, we pass our list of `agents` and `tasks`.
* We set `process=Process.sequential`.
* When `kickoff()` is called:
    1. `task1` (Find Cities) is executed by the `researcher`.
    2. The output of `task1` (the list of cities) is automatically passed as context.
    3. `task2` (Create Itinerary) is executed by the `planner`, using the cities list from `task1`.
    4. The final output of `task2` is returned.

It's simple and predictable: Task 1 -> Task 2 -> Done.
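
If you want to inspect what each task produced along the way, not just the final answer, the object returned by `kickoff()` exposes per-task results. A small sketch (assuming the `CrewOutput.tasks_output` attribute available in recent CrewAI versions):

```python
result = trip_crew.kickoff()

# Final answer (the output of the last task)
print(result.raw)

# Each task's individual output, in execution order
for task_output in result.tasks_output:
    print(task_output.description, "->", task_output.raw)
```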

### Hierarchical Process

For this process, the `Crew` needs a manager. You usually specify the language model the manager should use (`manager_llm`). The manager agent is created internally by CrewAI using this LLM.

```python
# Assuming 'researcher' and 'planner' agents are defined
# You need an LLM configured (e.g., from OpenAI, Ollama - see Chapter 6)

from crewai import Crew, Process, Task
from langchain_openai import ChatOpenAI  # Example LLM for the manager

# Example tasks (agent assignment might be handled by the manager)
task1 = Task(description='Find top 3 European cities for a sunny May birthday trip.', expected_output='List of 3 cities with justifications.')
task2 = Task(description='Create a 3-day itinerary for the best city found.', expected_output='Detailed 3-day plan.')

# Define the crew with a hierarchical process and a manager LLM
hierarchical_crew = Crew(
    agents=[researcher, planner],  # The worker agents
    tasks=[task1, task2],  # The tasks to be managed
    process=Process.hierarchical,  # Set the process to hierarchical
    manager_llm=ChatOpenAI(model="gpt-4")  # Specify the LLM for the manager agent
    # You could also provide a pre-configured manager_agent instance instead of manager_llm
)

# Start the work
# result = hierarchical_crew.kickoff()
# print(result)
```

**Explanation:**

* We set `process=Process.hierarchical`.
* We provide a list of worker `agents` (`researcher`, `planner`).
* We provide the `tasks` that need to be accomplished. Note that for the hierarchical process, you *might* not need to assign agents directly to tasks, as the manager can decide who is best suited. However, assigning them can still provide hints to the manager.
* Crucially, we provide `manager_llm`. CrewAI will use this LLM to create an internal 'Manager Agent'. This agent's implicit goal is to orchestrate the `agents` to complete the `tasks`.
* When `kickoff()` is called:
    1. The internal Manager Agent analyzes `task1` and `task2` and the available agents (`researcher`, `planner`).
    2. It decides which agent should do `task1` (likely the `researcher`). It delegates the task using internal tools (like `AgentTools`).
    3. It receives the result from the `researcher`.
    4. It analyzes the result and decides the next step – likely delegating `task2` to the `planner`, providing the context from `task1`.
    5. It receives the result from the `planner`.
    6. Once all tasks are deemed complete by the manager, it compiles and returns the final result.

This process is more dynamic, allowing the manager to adapt the workflow.
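
Instead of letting CrewAI build a default manager from `manager_llm`, you can supply your own manager via `manager_agent`, as the code comment above hints. A hedged sketch, reusing the agents and tasks defined earlier:

```python
from crewai import Agent, Crew, Process

custom_manager = Agent(
    role='Trip Project Manager',
    goal='Coordinate the researcher and planner to deliver a complete trip plan.',
    backstory='A seasoned project manager who delegates well and reviews carefully.',
    allow_delegation=True,  # The manager must be able to delegate work
    # llm=your_llm
)

managed_crew = Crew(
    agents=[researcher, planner],
    tasks=[task1, task2],
    process=Process.hierarchical,
    manager_agent=custom_manager,  # Used instead of manager_llm
)
```

This gives you full control over the manager's role, goal, and backstory rather than relying on CrewAI's default "Crew Manager" profile.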

## How Process Works "Under the Hood"

When you call `crew.kickoff()`, the first thing the `Crew` does is check its `process` attribute to determine the execution strategy.

1. **Input & Setup:** `kickoff()` prepares the agents and tasks, interpolating any initial inputs.
2. **Process Check:** It looks at `crew.process`.
3. **Execution Path:**
    * If `Process.sequential`, it calls an internal method like `_run_sequential_process()`.
    * If `Process.hierarchical`, it first ensures a manager agent exists (creating one if `manager_llm` was provided) and then calls a method like `_run_hierarchical_process()`.
4. **Task Loop (Sequential):** `_run_sequential_process()` iterates through the `tasks` list in order. For each task, it finds the assigned agent, gathers context from the *previous* task's output, and asks the agent to execute the task.
5. **Managed Execution (Hierarchical):** `_run_hierarchical_process()` delegates control to the manager agent. The manager agent, using its LLM and specialized delegation tools (like `AgentTools`), decides which task to tackle next and which worker agent to assign it to. It manages the flow until all tasks are completed.
6. **Output:** The final result (usually the output of the last task) is packaged and returned.

### Visualization

Let's visualize the difference:

**Sequential Process:**

```mermaid
sequenceDiagram
    participant User
    participant MyCrew as Crew (Sequential)
    participant ResearcherAgent as Researcher
    participant PlannerAgent as Planner

    User->>MyCrew: kickoff()
    MyCrew->>ResearcherAgent: Execute Task 1 ("Find cities")
    ResearcherAgent-->>MyCrew: Task 1 Output (Cities List)
    MyCrew->>PlannerAgent: Execute Task 2 ("Create itinerary")\nwith Task 1 Output context
    PlannerAgent-->>MyCrew: Task 2 Output (Itinerary)
    MyCrew-->>User: Final Result (Task 2 Output)
```

**Hierarchical Process:**

```mermaid
sequenceDiagram
    participant User
    participant MyCrew as Crew (Hierarchical)
    participant ManagerAgent as Manager
    participant ResearcherAgent as Researcher
    participant PlannerAgent as Planner

    User->>MyCrew: kickoff()
    MyCrew->>ManagerAgent: Goal: Plan Trip (Tasks: Find Cities, Create Itinerary)
    ManagerAgent->>ManagerAgent: Decide: Researcher should do Task 1
    ManagerAgent->>ResearcherAgent: Delegate: Execute Task 1 ("Find cities")
    ResearcherAgent-->>ManagerAgent: Task 1 Output (Cities List)
    ManagerAgent->>ManagerAgent: Decide: Planner should do Task 2 with context
    ManagerAgent->>PlannerAgent: Delegate: Execute Task 2 ("Create itinerary", Cities List)
    PlannerAgent-->>ManagerAgent: Task 2 Output (Itinerary)
    ManagerAgent->>MyCrew: Report Final Result (Itinerary)
    MyCrew-->>User: Final Result (Itinerary)
```

### Diving into the Code (`crew.py`)

The `Crew` class in `crewai/crew.py` holds the logic.

```python
# Simplified view from crewai/crew.py
from crewai.process import Process
from crewai.task import Task
from crewai.agents.agent_builder.base_agent import BaseAgent
# ... other imports

class Crew(BaseModel):
    # ... other fields like agents, tasks ...
    process: Process = Field(default=Process.sequential)
    manager_llm: Optional[Any] = Field(default=None)
    manager_agent: Optional[BaseAgent] = Field(default=None)
    # ... other fields ...

    @model_validator(mode="after")
    def check_manager_llm(self):
        # Ensures manager_llm or manager_agent is set for hierarchical process
        if self.process == Process.hierarchical:
            if not self.manager_llm and not self.manager_agent:
                raise PydanticCustomError(
                    "missing_manager_llm_or_manager_agent",
                    "Attribute `manager_llm` or `manager_agent` is required when using hierarchical process.",
                    {},
                )
        return self

    def kickoff(self, inputs: Optional[Dict[str, Any]] = None) -> CrewOutput:
        # ... setup, input interpolation, callback setup ...

        # THE CORE DECISION BASED ON PROCESS:
        if self.process == Process.sequential:
            result = self._run_sequential_process()
        elif self.process == Process.hierarchical:
            # Ensure the manager is ready before running
            self._create_manager_agent()  # Creates the manager if needed
            result = self._run_hierarchical_process()
        else:
            raise NotImplementedError(f"Process '{self.process}' not implemented.")

        # ... calculate usage metrics, final formatting ...
        return result

    def _run_sequential_process(self) -> CrewOutput:
        task_outputs = []
        for task_index, task in enumerate(self.tasks):
            agent = task.agent  # Get the assigned agent
            # ... handle conditional tasks, async tasks ...
            context = self._get_context(task, task_outputs)  # Get previous output
            output = task.execute_sync(agent=agent, context=context)  # Run the task
            task_outputs.append(output)
            # ... logging/callbacks ...
        return self._create_crew_output(task_outputs)

    def _run_hierarchical_process(self) -> CrewOutput:
        # This delegates the orchestration to the manager agent.
        # The manager agent uses its LLM and tools (AgentTools)
        # to call the worker agents sequentially or in parallel as it sees fit.
        manager = self.manager_agent
        # Simplified concept: the manager executes a "meta-task"
        # whose goal is to complete the crew's tasks using the available agents.
        # The actual implementation involves the manager agent's execution loop.
        return self._execute_tasks(self.tasks)  # The manager guides this execution internally

    def _create_manager_agent(self):
        # Logic to set up the self.manager_agent instance, either using
        # the provided self.manager_agent or creating a default one
        # using self.manager_llm and AgentTools(agents=self.agents).
        if self.manager_agent is None and self.manager_llm:
            # Simplified: create a default manager agent here.
            # It gets tools to delegate work to self.agents.
            self.manager_agent = Agent(
                role="Crew Manager",
                goal="Coordinate the crew to achieve their goals.",
                backstory="An expert project manager.",
                llm=self.manager_llm,
                tools=AgentTools(agents=self.agents).tools(),  # Gives it delegation capability
                allow_delegation=True,  # Must be true for the manager
                verbose=self.verbose,
            )
            self.manager_agent.crew = self  # Link back to the crew
        # Ensure the manager has the necessary setup...

    def _execute_tasks(self, tasks: List[Task], ...) -> CrewOutput:
        """Internal method used by both sequential and hierarchical processes
        to iterate through tasks. In hierarchical, the manager agent influences
        which agent runs which task via delegation tools."""
        # ... loops through tasks, gets agent (directly for seq, via manager for hier), executes ...
        pass

    # ... other helper methods like _get_context, _create_crew_output ...
```

Key takeaways from the code:

* The `Crew` stores the `process` type (`sequential` or `hierarchical`).
* A validation (`check_manager_llm`) ensures a manager (`manager_llm` or `manager_agent`) is provided if `process` is `hierarchical`.
* The `kickoff` method explicitly checks `self.process` to decide which internal execution method (`_run_sequential_process` or `_run_hierarchical_process`) to call.
* `_run_sequential_process` iterates through tasks in order.
* `_run_hierarchical_process` relies on the `manager_agent` (created by `_create_manager_agent` if needed) to manage the task execution flow, often using delegation tools.
## Conclusion

You've now learned about the `Process` - the crucial setting that defines *how* your [Crew](01_crew.md) collaborates.

* **`Sequential`** is like a checklist: tasks run one by one, in order, with outputs flowing directly to the next task. Simple and predictable.
* **`Hierarchical`** is like having a manager: a dedicated manager [Agent](02_agent.md) coordinates the worker agents, deciding who does what and when. More flexible for complex workflows.

Choosing the right process helps structure your agent interactions effectively.

So far, we've built the team ([Agent](02_agent.md)), defined the work ([Task](03_task.md)), given them abilities ([Tool](04_tool.md)), and decided on the workflow ([Process](05_process.md)). But what powers the "thinking" part of each agent? What is the "brain" that understands roles, goals, backstories, and uses tools? That's the Large Language Model, or [LLM](06_llm.md). Let's dive into that next!

**Next:** [Chapter 6: LLM - The Agent's Brain](06_llm.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

330
docs/CrewAI/06_llm.md
Normal file
@@ -0,0 +1,330 @@

# Chapter 6: LLM - The Agent's Brain

In the [previous chapter](05_process.md), we explored the `Process` - how the `Crew` organizes the workflow for its `Agent`s, deciding whether they work sequentially or are managed hierarchically. We now have specialized agents ([Agent](02_agent.md)), defined work ([Task](03_task.md)), useful abilities ([Tool](04_tool.md)), and a workflow strategy ([Process](05_process.md)).

But what actually does the *thinking* inside an agent? When we give the 'Travel Researcher' agent the task "Find sunny European cities," what part of the agent understands this request, decides to use the search tool, interprets the results, and writes the final list?

This core thinking component is the **Large Language Model**, or **LLM**.

## Why Do Agents Need an LLM?

Imagine our 'Travel Researcher' agent again. It has a `role`, `goal`, and `backstory`. It has a `Task` to complete and maybe a `Tool` to search the web. But it needs something to:

1. **Understand:** Read the task description, its own role/goal, and any context from previous tasks.
2. **Reason:** Figure out a plan. "Okay, I need sunny cities. My description says I'm an expert. The task asks for 3. I should use the search tool to get current info."
3. **Act:** Decide *when* to use a tool and *what* input to give it (e.g., formulate the search query).
4. **Generate:** Take the information (search results, its own knowledge) and write the final output in the expected format.

The LLM is the engine that performs all these cognitive actions. It's the "brain" that drives the agent's behavior based on the instructions and tools provided.

**Problem Solved:** The LLM provides the core intelligence for each `Agent`. It processes language, makes decisions (like which tool to use or what text to generate), and ultimately enables the agent to perform its assigned `Task` based on its defined profile.

## What is an LLM in CrewAI?

Think of an LLM as a highly advanced, versatile AI assistant you can interact with using text. Models like OpenAI's GPT-4, Google's Gemini, Anthropic's Claude, or open-source models run locally via tools like Ollama are all examples of LLMs. They are trained on vast amounts of text data and can understand instructions, answer questions, write text, summarize information, and even make logical deductions.

In CrewAI, the `LLM` concept is an **abstraction**. CrewAI itself doesn't *include* these massive language models. Instead, it provides a standardized way to **connect to and interact with** various LLMs, whether they are hosted by companies like OpenAI or run on your own computer.

**How CrewAI Handles LLMs:**

* **`litellm` Integration:** CrewAI uses a fantastic library called `litellm` under the hood. `litellm` acts like a universal translator, allowing CrewAI to talk to over 100 different LLM providers (OpenAI, Azure OpenAI, Gemini, Anthropic, Ollama, Hugging Face, etc.) using a consistent interface. This means you can easily switch the "brain" of your agents without rewriting large parts of your code.
* **Standard Interface:** The CrewAI `LLM` abstraction (often represented by helper classes or configuration settings) simplifies how you specify which model to use and how it should behave. It handles common parameters like:
    * `model`: The specific name of the LLM you want to use (e.g., `"gpt-4o"`, `"ollama/llama3"`, `"gemini-pro"`).
    * `temperature`: Controls the randomness (creativity) of the output. Lower values (e.g., 0.1) make the output more deterministic and focused, while higher values (e.g., 0.8) make it more creative but potentially less factual.
    * `max_tokens`: The maximum number of tokens (roughly, word pieces) the LLM should generate in its response.
* **API Management:** It manages the technical details of sending requests to the chosen LLM provider and receiving the responses.

Essentially, CrewAI lets you plug in the LLM brain of your choice for your agents.

## Configuring an LLM for Your Crew

You need to tell CrewAI which LLM(s) your agents should use. There are several ways to do this, ranging from letting CrewAI detect settings automatically to explicitly configuring specific models.

**1. Automatic Detection (Environment Variables)**

Often the easiest way for common models like OpenAI's is to set environment variables. CrewAI (via `litellm`) can pick these up automatically.

If you set these in your system or a `.env` file:

```bash
# Example .env file
OPENAI_API_KEY="sk-your_openai_api_key_here"
# Optional: Specify the model, otherwise it uses a default like gpt-4o
OPENAI_MODEL_NAME="gpt-4o"
```

Then you often don't need to specify the LLM explicitly in your code:

```python
# agent.py (simplified)
from crewai import Agent

# If OPENAI_API_KEY and OPENAI_MODEL_NAME are set in the environment,
# CrewAI might automatically configure an OpenAI LLM for this agent.
researcher = Agent(
    role='Travel Researcher',
    goal='Find interesting cities in Europe',
    backstory='Expert researcher.',
    # No 'llm=' parameter needed here if env vars are set
)
```
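
One thing to keep in mind: Python does not read a `.env` file by itself. A common approach (assuming the third-party `python-dotenv` package) is to load it at the top of your script:

```python
# pip install python-dotenv
from dotenv import load_dotenv

load_dotenv()  # Copies the variables from .env into os.environ
```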

**2. Explicit Configuration (Recommended for Clarity)**

It's usually better to be explicit about which LLM you want to use. CrewAI integrates well with LangChain's LLM wrappers, which are commonly used.

**Example: Using OpenAI (GPT-4o)**

```python
# Make sure you have langchain_openai installed: pip install langchain-openai
import os
from langchain_openai import ChatOpenAI
from crewai import Agent

# Set the API key (best practice: use environment variables)
# os.environ["OPENAI_API_KEY"] = "sk-your_key_here"

# Instantiate the OpenAI LLM wrapper
openai_llm = ChatOpenAI(model="gpt-4o", temperature=0.7)

# Pass the configured LLM to the Agent
researcher = Agent(
    role='Travel Researcher',
    goal='Find interesting cities in Europe',
    backstory='Expert researcher.',
    llm=openai_llm  # Explicitly assign the LLM
)

# You can also assign a default LLM to the Crew
# from crewai import Crew
# trip_crew = Crew(
#     agents=[researcher],
#     tasks=[...],
#     # Manager LLM for hierarchical process
#     manager_llm=openai_llm
#     # A function_calling_llm can also be set for tool use reasoning
#     # function_calling_llm=openai_llm
# )
```

**Explanation:**

* We import `ChatOpenAI` from `langchain_openai`.
* We create an instance, specifying the `model` name and optionally other parameters like `temperature`.
* We pass this `openai_llm` object to the `llm` parameter when creating the `Agent`. This agent will now use GPT-4o for its thinking.
* You can also assign LLMs at the `Crew` level, especially the `manager_llm` for hierarchical processes or a default `function_calling_llm`, which helps agents decide *which* tool to use.

**Example: Using a Local Model via Ollama (Llama 3)**

If you have Ollama running locally with a model like Llama 3 pulled (`ollama pull llama3`):

```python
# Make sure you have langchain_community installed: pip install langchain-community
from langchain_community.llms import Ollama
from crewai import Agent

# Instantiate the Ollama LLM wrapper
# Make sure the Ollama server is running!
ollama_llm = Ollama(model="llama3", base_url="http://localhost:11434")
# temperature, etc. can also be set if supported by the model/wrapper

# Pass the configured LLM to the Agent
local_researcher = Agent(
    role='Travel Researcher',
    goal='Find interesting cities in Europe',
    backstory='Expert researcher.',
    llm=ollama_llm  # Use the local Llama 3 model
)
```

**Explanation:**

* We import `Ollama` from `langchain_community.llms`.
* We create an instance, specifying the `model` name ("llama3" in this case, assuming it's available in your Ollama setup) and the `base_url` where your Ollama server is running.
* We pass `ollama_llm` to the `Agent`. Now, this agent's "brain" runs entirely on your local machine!
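
Before kicking off the crew, it can save debugging time to confirm the Ollama server is reachable and the model has been pulled (assuming Ollama's default port 11434):

```bash
# List the models your local Ollama server has available
curl http://localhost:11434/api/tags

# Pull Llama 3 if it isn't listed yet
ollama pull llama3
```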

**CrewAI's `LLM` Class (Advanced/Direct `litellm` Usage)**

CrewAI also provides its own `LLM` class (`from crewai import LLM`), which allows more direct configuration using `litellm` parameters. This is less common for beginners than the LangChain wrappers shown above, but it offers fine-grained control.
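
As a rough sketch of what that looks like (the `"provider/model"` string follows `litellm`'s convention; the exact options supported depend on your CrewAI version):

```python
from crewai import LLM, Agent

# Direct configuration via CrewAI's own LLM class
gemini_llm = LLM(
    model="gemini/gemini-1.5-pro",  # litellm-style provider/model string
    temperature=0.3,
    max_tokens=1024,
)

researcher = Agent(
    role='Travel Researcher',
    goal='Find interesting cities in Europe',
    backstory='Expert researcher.',
    llm=gemini_llm,
)
```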

**Passing LLMs to the Crew**

Besides assigning an LLM to each agent individually, you can set defaults or specific roles at the `Crew` level:

```python
from crewai import Crew, Process
from langchain_openai import ChatOpenAI

# Assume agents 'researcher', 'planner' and tasks 'task1', 'task2' are defined

openai_llm = ChatOpenAI(model="gpt-4o")
fast_llm = ChatOpenAI(model="gpt-3.5-turbo")  # Maybe a faster/cheaper model

trip_crew = Crew(
    agents=[researcher, planner],  # Agents might have their own LLMs assigned too
    tasks=[task1, task2],
    process=Process.hierarchical,
    # The manager agent will use gpt-4o
    manager_llm=openai_llm,
    # Use gpt-3.5-turbo specifically for deciding which tool to use (can save costs)
    function_calling_llm=fast_llm
)
```

* `manager_llm`: Specifies the brain for the manager agent in a hierarchical process.
* `function_calling_llm`: Specifies the LLM used by agents primarily to decide *which tool to call* and *with what arguments*. This can sometimes be a faster/cheaper model than the one used for generating the final detailed response. If not set, agents typically use their main `llm`.

If an agent doesn't have an `llm` explicitly assigned, it might inherit the `function_calling_llm` or default to environment settings. It's usually clearest to assign LLMs explicitly where needed.
## How LLM Interaction Works Internally

When an [Agent](02_agent.md) needs to think (e.g., execute a [Task](03_task.md)), the process looks like this:

1. **Prompt Assembly:** The `Agent` gathers all relevant information: its `role`, `goal`, `backstory`, the `Task` description, `expected_output`, any `context` from previous tasks, and the descriptions of its available `Tool`s. It assembles this into a detailed prompt.
2. **LLM Object Call:** The `Agent` passes this prompt to its configured `LLM` object (e.g., the `ChatOpenAI` instance or the `Ollama` instance we created).
3. **`litellm` Invocation:** The CrewAI/LangChain `LLM` object uses `litellm`'s `completion` function, passing the assembled prompt (formatted as messages), the target `model` name, and other parameters (`temperature`, `max_tokens`, `tools`, etc.).
4. **API Request:** `litellm` handles the specifics of communicating with the target LLM's API (e.g., sending a request to OpenAI's API endpoint or the local Ollama server).
5. **LLM Processing:** The actual LLM (GPT-4, Llama 3, etc.) processes the request.
6. **API Response:** The LLM provider sends back the response (which could be generated text or a decision to use a specific tool with certain arguments).
7. **`litellm` Response Handling:** `litellm` receives the API response and standardizes it.
8. **LLM Object Response:** The `LLM` object receives the standardized response from `litellm`.
9. **Result to Agent:** The `LLM` object returns the result (text or tool call information) back to the `Agent`.
10. **Agent Action:** The `Agent` then either uses the generated text as its output or, if the LLM decided to use a tool, it executes the specified tool.

Let's visualize this:

```mermaid
sequenceDiagram
    participant Agent
    participant LLM_Object as LLM Object (e.g., ChatOpenAI)
    participant LiteLLM
    participant ProviderAPI as Actual LLM API (e.g., OpenAI)

    Agent->>Agent: Assemble Prompt (Role, Goal, Task, Tools...)
    Agent->>LLM_Object: call(prompt, tools_schema)
    LLM_Object->>LiteLLM: litellm.completion(model, messages, ...)
    LiteLLM->>ProviderAPI: Send API Request
    ProviderAPI-->>LiteLLM: Receive API Response (text or tool_call)
    LiteLLM-->>LLM_Object: Standardized Response
    LLM_Object-->>Agent: Result (text or tool_call)
    Agent->>Agent: Process Result (Output text or Execute tool)
```

**Diving into the Code (`llm.py`, `utilities/llm_utils.py`)**

The primary logic resides in `crewai/llm.py` and the helper `crewai/utilities/llm_utils.py`.

* **`crewai/utilities/llm_utils.py`:** The `create_llm` function is key. It handles the logic of figuring out which LLM to instantiate based on environment variables, direct `LLM` object input, or string names. It tries to create an `LLM` instance.
* **`crewai/llm.py`:**
    * The `LLM` class itself holds the configuration (`model`, `temperature`, etc.).
    * The `call` method is the main entry point. It takes the `messages` (the prompt) and optional `tools`.
    * It calls `_prepare_completion_params` to format the request parameters based on the LLM's requirements and the provided configuration.
    * Crucially, it then calls `litellm.completion(**params)`. This is where the magic happens – `litellm` takes over communication with the actual LLM API.
    * It handles the response from `litellm`, checking for text content or tool calls (`_handle_non_streaming_response` or `_handle_streaming_response`).
    * It uses helper methods like `_format_messages_for_provider` to deal with quirks of different LLMs (like Anthropic needing a 'user' message first).
```python
# Simplified view from crewai/llm.py

# Import litellm and other necessary modules
import litellm
from typing import List, Dict, Optional, Union, Any

class LLM:
    def __init__(self, model: str, temperature: Optional[float] = 0.7, **kwargs):
        self.model = model
        self.temperature = temperature
        # ... store other parameters like max_tokens, api_key, base_url ...
        self.additional_params = kwargs
        self.stream = False  # Default to non-streaming

    def _prepare_completion_params(self, messages, tools=None) -> Dict[str, Any]:
        # Formats messages based on provider (e.g., Anthropic)
        formatted_messages = self._format_messages_for_provider(messages)

        params = {
            "model": self.model,
            "messages": formatted_messages,
            "temperature": self.temperature,
            "tools": tools,
            "stream": self.stream,
            # ... add other stored parameters (max_tokens, api_key, etc.) ...
            **self.additional_params,
        }
        # Remove None values
        return {k: v for k, v in params.items() if v is not None}

    def call(self, messages, tools=None, callbacks=None, available_functions=None) -> Union[str, Any]:
        # ... (emit start event, validate params) ...
        try:
            # Prepare the parameters for litellm
            params = self._prepare_completion_params(messages, tools)

            # Decide whether to stream or not (simplified here)
            if self.stream:
                # Handles chunk processing, tool calls at stream end
                return self._handle_streaming_response(params, callbacks, available_functions)
            else:
                # Makes a single call, handles tool calls from the response
                return self._handle_non_streaming_response(params, callbacks, available_functions)
        except Exception as e:
            # ... (emit failure event, handle exceptions like context window exceeded) ...
            raise e

    def _handle_non_streaming_response(self, params, callbacks, available_functions):
        # THE CORE CALL TO LITELLM
        response = litellm.completion(**params)

        # Extract text content
        text_response = response.choices[0].message.content or ""

        # Check for tool calls in the response
        tool_calls = getattr(response.choices[0].message, "tool_calls", [])

        if not tool_calls or not available_functions:
            # ... (emit success event) ...
            return text_response  # Return plain text
        else:
            # Handle the tool call (runs the actual function)
            tool_result = self._handle_tool_call(tool_calls, available_functions)
            if tool_result is not None:
                return tool_result  # Return tool output
            else:
                return text_response  # Fall back to text if the tool call fails

    def _handle_tool_call(self, tool_calls, available_functions):
        # Extracts the function name and args from tool_calls[0],
        # looks up the function in available_functions,
        # executes the function with the args, and returns the result.
        # ... (error handling) ...
        pass

    def _format_messages_for_provider(self, messages):
        # Handles provider-specific message formatting rules
        # (e.g., ensuring Anthropic starts with a 'user' role)
        pass

    # ... other methods like _handle_streaming_response ...
```

This simplified view shows how the `LLM` class acts as a wrapper around `litellm`, preparing requests and processing responses, shielding the rest of CrewAI from the complexities of different LLM APIs.

## Conclusion

You've learned about the **LLM**, the essential "brain" powering your CrewAI [Agent](02_agent.md)s. It's the component that understands language, reasons about tasks, decides on actions (like using [Tool](04_tool.md)s), and generates text.

We saw that CrewAI uses the `litellm` library to provide a flexible way to connect to a wide variety of LLM providers (like OpenAI, Google Gemini, Anthropic Claude, or local models via Ollama). You can configure which LLM your agents or crew use, either implicitly through environment variables or explicitly by passing configured LLM objects (often using LangChain wrappers) during `Agent` or `Crew` creation.

This abstraction makes CrewAI powerful, allowing you to experiment with different models to find the best fit for your specific needs and budget.

But sometimes, agents need to remember things from past interactions or previous tasks within the same run. How does CrewAI handle short-term and potentially long-term memory? Let's explore that in the next chapter!

**Next:** [Chapter 7: Memory - Giving Agents Recall](07_memory.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

216
docs/CrewAI/07_memory.md
Normal file
@@ -0,0 +1,216 @@

# Chapter 7: Memory - Giving Your Crew Recall

In the [previous chapter](06_llm.md), we looked at the Large Language Model ([LLM](06_llm.md)) – the "brain" that allows each [Agent](02_agent.md) to understand, reason, and generate text. Now we have agents that can think, perform [Task](03_task.md)s using [Tool](04_tool.md)s, and follow a [Process](05_process.md).

But imagine a team working on a complex project over several days. What if every morning, they completely forgot everything they discussed and learned the previous day? They'd waste a lot of time repeating work and asking the same questions. By default, AI agents often behave like this – they only remember the immediate conversation.

How can we give our CrewAI team the ability to remember past information? That's where **Memory** comes in!

## Why Do We Need Memory?

AI agents, especially when working together in a [Crew](01_crew.md), often need to build upon previous interactions or knowledge gained during their work. Without memory:

* An agent might ask for the same information multiple times.
* Context from an earlier task might be lost by the time a later task runs.
* The crew can't easily learn from past experiences across different projects or runs.
* Tracking specific details about key people, places, or concepts mentioned during the process becomes difficult.

**Problem Solved:** Memory provides [Agent](02_agent.md)s and the [Crew](01_crew.md) with the ability to store and recall past interactions, information, and insights. It's like giving your AI team shared notes, a collective memory, or institutional knowledge.

## What is Memory in CrewAI?

Think of Memory as the **storage system** for your Crew's experiences and knowledge. It allows the Crew to persist information beyond a single interaction or task execution. CrewAI implements different kinds of memory to handle different needs:

1. **`ShortTermMemory`**:
    * **Analogy:** Like your computer's RAM or a person's short-term working memory.
    * **Purpose:** Holds immediate context and information relevant *within the current run* of the Crew. What happened in the previous task? What was just discussed?
    * **How it helps:** Ensures that the output of one task is available and easily accessible as context for the next task within the same `kickoff()` execution. It helps maintain the flow of conversation and information *during* a single job.

2. **`LongTermMemory`**:
    * **Analogy:** Like a team's documented "lessons learned" database or a long-term knowledge base.
    * **Purpose:** Stores insights, evaluations, and key takeaways *across multiple runs* of the Crew. Did a similar task succeed or fail in the past? What strategies worked well?
    * **How it helps:** Allows the Crew to improve over time by recalling past performance on similar tasks. (Note: Effective use often involves evaluating task outcomes, which can be an advanced topic.)

3. **`EntityMemory`**:
    * **Analogy:** Like a CRM (Customer Relationship Management) system, a character sheet in a game, or index cards about important topics.
    * **Purpose:** Tracks specific entities (like people, companies, projects, concepts) mentioned during the Crew's execution and stores details and relationships about them. Who is "Dr. Evans"? What is "Project Phoenix"?
    * **How it helps:** Maintains consistency and detailed knowledge about key subjects, preventing the Crew from forgetting important details about who or what it's dealing with.
## How Does Memory Help?

Using memory makes your Crew more effective:

* **Better Context:** Agents have access to relevant past information, leading to more informed decisions and responses.
* **Efficiency:** Avoids redundant questions and re-work by recalling previously established facts or results.
* **Learning (LTM):** Enables the Crew to get better over time based on past performance.
* **Consistency (Entity):** Keeps track of important details about recurring topics or entities.
* **Shared Understanding:** Helps create a common ground of knowledge for all agents in the Crew.
## Using Memory in Your Crew

The simplest way to start using memory is by enabling it when you define your `Crew`. Setting `memory=True` activates the core memory components (ShortTerm and Entity Memory) for context building within a run.

Let's add memory to our trip planning `Crew`:

```python
# Assuming 'researcher' and 'planner' agents are defined (Chapter 2)
# Assuming 'task1' and 'task2' are defined (Chapter 3)
# Assuming an LLM is configured (Chapter 6)

from crewai import Crew, Process

# researcher = Agent(...)
# planner = Agent(...)
# task1 = Task(...)
# task2 = Task(...)

# Define the crew WITH memory enabled
trip_crew_with_memory = Crew(
    agents=[researcher, planner],
    tasks=[task1, task2],
    process=Process.sequential,
    memory=True  # <-- Enable memory features!
    # verbose=2
)

# Start the work. Agents will now leverage memory.
# result = trip_crew_with_memory.kickoff()
# print(result)
```

**Explanation:**

* We simply add the `memory=True` parameter when creating the `Crew`.
* **What does this do?** Behind the scenes, CrewAI initializes `ShortTermMemory` and `EntityMemory` for this crew.
* **How is it used?**
    * **ShortTermMemory:** As tasks complete within this `kickoff()` run, their outputs and key interactions can be stored. When the next task starts, CrewAI automatically queries this memory for relevant recent context to add to the prompt for the next agent. This makes the context flow smoother than just passing the raw output of the previous task.
    * **EntityMemory:** As agents discuss entities (e.g., "Lisbon," "May birthday trip"), the memory tries to capture details about them. If "Lisbon" is mentioned again later, the memory can provide the stored details ("Coastal city, known for trams and Fado music...") as context.
    * **LongTermMemory:** While `memory=True` sets up the *potential* for LTM, actively using it to learn across multiple runs often requires additional steps like task evaluation or explicit saving mechanisms, which are more advanced topics beyond this basic introduction. For now, focus on the benefits of STM and Entity Memory for within-run context.

By just adding `memory=True`, your agents automatically get better at remembering what's going on *within the current job*.
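
Memory relies on an embedding model to store and search snippets, and you can customize which one it uses. A hedged sketch, assuming the `embedder` parameter accepted by recent CrewAI versions:

```python
trip_crew_with_memory = Crew(
    agents=[researcher, planner],
    tasks=[task1, task2],
    process=Process.sequential,
    memory=True,
    # Which embedding model the memory stores use (provider/config format)
    embedder={
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"},
    },
)
```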

## How Memory Works Internally (Simplified)

So, what happens "under the hood" when `memory=True` and an agent starts a task?

1. **Task Execution Start:** The [Crew](01_crew.md) assigns a [Task](03_task.md) to an [Agent](02_agent.md).
2. **Context Gathering:** Before calling the [LLM](06_llm.md), the Crew interacts with its **Memory Module** (specifically, the `ContextualMemory` orchestrator). It asks, "What relevant memories do we have for this task, considering the description and any immediate context?"
3. **Memory Module Queries:** The `ContextualMemory` then queries the different active memory types:
    * It asks `ShortTermMemory`: "Show me recent interactions or results related to this query." (Uses RAG/vector search on recent data.)
    * It asks `EntityMemory`: "Tell me about entities mentioned in this query." (Uses RAG/vector search on stored entity data.)
    * *If LTM were being actively queried (less common automatically):* "Any long-term insights related to this type of task?" (Usually queries a database like SQLite.)
4. **Context Consolidation:** The Memory Module gathers the relevant snippets from each memory type.
5. **Prompt Augmentation:** This retrieved memory context is combined with the original task description, expected output, and any direct context (like the previous task's raw output).
6. **LLM Call:** This augmented, richer prompt is sent to the agent's [LLM](06_llm.md).
7. **Agent Response:** The agent generates its response, now informed by the retrieved memories.
8. **Memory Update:** As the task completes, its key interactions and outputs are processed and potentially saved back into ShortTermMemory and EntityMemory for future use within this run.

Let's visualize this context-building flow:

```mermaid
sequenceDiagram
    participant C as Crew
    participant A as Agent
    participant CtxMem as ContextualMemory
    participant STM as ShortTermMemory
    participant EM as EntityMemory
    participant LLM as Agent's LLM

    C->>A: Execute Task(description, current_context)
    Note over A: Need to build full prompt context.
    A->>CtxMem: Get memory context for task query
    CtxMem->>STM: Search(task_query)
    STM-->>CtxMem: Recent memories (e.g., "Found Lisbon earlier")
    CtxMem->>EM: Search(task_query)
    EM-->>CtxMem: Entity details (e.g., "Lisbon: Capital of Portugal")
    CtxMem-->>A: Combined Memory Snippets
    A->>A: Assemble Final Prompt (Task Desc + Current Context + Memory Snippets)
    A->>LLM: Process Augmented Prompt
    LLM-->>A: Generate Response
    A-->>C: Task Result
    Note over C: Crew updates memories (STM, EM) with task results.
```

**Diving into the Code (High Level)**

* **`crewai/crew.py`:** When you set `memory=True` in the `Crew` constructor, the `create_crew_memory` validator method (triggered by Pydantic) initializes instances of `ShortTermMemory`, `LongTermMemory`, and `EntityMemory` and stores them in private attributes like `_short_term_memory`.

```python
# Simplified from crewai/crew.py
class Crew(BaseModel):
    memory: bool = Field(default=False, ...)
    _short_term_memory: Optional[InstanceOf[ShortTermMemory]] = PrivateAttr()
    _long_term_memory: Optional[InstanceOf[LongTermMemory]] = PrivateAttr()
    _entity_memory: Optional[InstanceOf[EntityMemory]] = PrivateAttr()
    # ... other fields ...

    @model_validator(mode="after")
    def create_crew_memory(self) -> "Crew":
        if self.memory:
            # Simplified: initializes memory objects if memory=True
            self._long_term_memory = LongTermMemory(...)
            self._short_term_memory = ShortTermMemory(crew=self, ...)
            self._entity_memory = EntityMemory(crew=self, ...)
        return self
```

* **`crewai/memory/contextual/contextual_memory.py`:** This class is responsible for orchestrating the retrieval from different memory types. Its `build_context_for_task` method takes the task information and queries the relevant memories.

```python
# Simplified from crewai/memory/contextual/contextual_memory.py
class ContextualMemory:
    def __init__(self, stm: ShortTermMemory, ltm: LongTermMemory, em: EntityMemory, ...):
        self.stm = stm
        self.ltm = ltm
        self.em = em
        # ...

    def build_context_for_task(self, task, context) -> str:
        query = f"{task.description} {context}".strip()
        if not query:
            return ""

        memory_context = []
        # Fetch relevant info from Short Term Memory
        memory_context.append(self._fetch_stm_context(query))
        # Fetch relevant info from Entity Memory
        memory_context.append(self._fetch_entity_context(query))
        # Fetch relevant info from Long Term Memory (if applicable)
        # memory_context.append(self._fetch_ltm_context(task.description))

        return "\n".join(filter(None, memory_context))

    def _fetch_stm_context(self, query) -> str:
        stm_results = self.stm.search(query)
        # ... format results into formatted_results ...
        return formatted_results if stm_results else ""

    def _fetch_entity_context(self, query) -> str:
        em_results = self.em.search(query)
        # ... format results into formatted_results ...
        return formatted_results if em_results else ""
```

* **Memory Types (`short_term_memory.py`, `entity_memory.py`, `long_term_memory.py`):**
    * `ShortTermMemory` and `EntityMemory` typically use `RAGStorage` (`crewai/memory/storage/rag_storage.py`), which often relies on a vector database like ChromaDB to store embeddings of text snippets and find similar ones based on a query.
    * `LongTermMemory` typically uses `LTMSQLiteStorage` (`crewai/memory/storage/ltm_sqlite_storage.py`) to save structured data about task evaluations (like descriptions, scores, suggestions) into an SQLite database file.

The key idea is that `memory=True` sets up these storage systems and the `ContextualMemory` orchestrator, which automatically enriches agent prompts with relevant remembered information.
## Conclusion

You've learned about the crucial concept of **Memory** in CrewAI! Memory gives your agents the ability to recall past information, preventing them from being purely stateless. We explored the three main types:

* **`ShortTermMemory`**: For context within the current run.
* **`LongTermMemory`**: For insights across multiple runs (more advanced).
* **`EntityMemory`**: For tracking specific people, places, or concepts.

Enabling memory with `memory=True` in your `Crew` is the first step to making your agents more context-aware and efficient, primarily leveraging Short Term and Entity memory automatically.

But what if your agents need access to a large body of pre-existing information, like company documentation, technical manuals, or a specific set of research papers? That's static information, not necessarily memories of *interactions*. How do we provide that? That's where the concept of **Knowledge** comes in. Let's explore that next!

**Next:** [Chapter 8: Knowledge - Providing External Information](08_knowledge.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

266
docs/CrewAI/08_knowledge.md
Normal file
@@ -0,0 +1,266 @@

# Chapter 8: Knowledge - Providing External Information

In [Chapter 7: Memory](07_memory.md), we learned how to give our [Crew](01_crew.md) the ability to remember past interactions and details using `Memory`. This helps them maintain context within a single run and potentially across runs.

But what if your [Agent](02_agent.md) needs access to a large body of *existing* information that isn't derived from its own conversations? Think about company documents, technical manuals, specific research papers, or a product catalog. This information exists *before* the Crew starts working. How do we give our agents access to this specific library of information?

That's where **`Knowledge`** comes in!

## Why Do We Need Knowledge?

Imagine you have an [Agent](02_agent.md) whose job is to answer customer questions about a specific product, "Widget Pro". You want this agent to *only* use the official "Widget Pro User Manual" to answer questions, not its general knowledge from the internet (which might be outdated or wrong).

Without a way to provide the manual, the agent might hallucinate answers or use incorrect information. `Knowledge` allows us to load specific documents (like the user manual), process them, and make them searchable for our agents.

**Problem Solved:** `Knowledge` provides your [Agent](02_agent.md)s with access to specific, pre-defined external information sources (like documents or databases), allowing them to retrieve relevant context to enhance their understanding and task execution based on that specific information.
## What is Knowledge?

Think of `Knowledge` as giving your [Crew](01_crew.md) access to a **specialized, private library** full of specific documents or information. It consists of a few key parts:

1. **`KnowledgeSource`**: This represents the actual *source* of the information. It could be:
    * A local file (PDF, DOCX, TXT, etc.)
    * A website URL
    * A database connection (more advanced)

    CrewAI uses helpful classes like `CrewDoclingSource` to easily handle various file types and web content. You tell the `KnowledgeSource` *where* the information is (e.g., the file path to your user manual).

2. **Processing & Embedding**: When you create a `Knowledge` object with sources, the information is automatically:
    * **Loaded**: The content is read from the source (e.g., text extracted from the PDF).
    * **Chunked**: The long text is broken down into smaller, manageable pieces (chunks).
    * **Embedded**: Each chunk is converted into a numerical representation (an embedding vector) that captures its meaning. This is done using an embedding model (often specified via the `embedder` configuration).

3. **`KnowledgeStorage` (Vector Database)**: These embedded chunks are then stored in a special kind of database called a vector database. CrewAI typically uses **ChromaDB** by default for this.
    * **Why?** Vector databases are optimized for finding information based on *semantic similarity*. When an agent asks a question related to a topic, the database can quickly find the text chunks whose meanings (embeddings) are closest to the meaning of the question.

4. **Retrieval**: When an [Agent](02_agent.md) needs information for its [Task](03_task.md), it queries the `Knowledge` object. This query is also embedded, and the `KnowledgeStorage` efficiently retrieves the most relevant text chunks from the original documents. These chunks are then provided to the agent as context.

In short: `Knowledge` = Specific Info Sources + Processing/Embedding + Vector Storage + Retrieval.
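
Files aren't the only option; plain strings can be sources too. A minimal sketch (assuming the `StringKnowledgeSource` class shipped with CrewAI; the warranty text is made up for illustration):

```python
from crewai import Knowledge
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource

faq_source = StringKnowledgeSource(
    content="Widget Pro ships with a 2-year warranty. To reset it, hold the power button for 10 seconds."
)

faq_knowledge = Knowledge(
    collection_name="widget_pro_faq",
    sources=[faq_source],
)
```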

## Using Knowledge in Your Crew

Let's give our 'Product Support Agent' access to a hypothetical "widget_pro_manual.txt" file.

**1. Prepare Your Knowledge Source File:**

Make sure you have a directory named `knowledge` in your project's root folder. Place your file (e.g., `widget_pro_manual.txt`) inside this directory.

```
your_project_root/
├── knowledge/
│   └── widget_pro_manual.txt
└── your_crewai_script.py
```

*(Make sure `widget_pro_manual.txt` contains some text about Widget Pro.)*
|
||||
|
||||
**2. Define the Knowledge Source and Knowledge Object:**
|
||||
|
||||
```python
# Make sure you have docling installed for file handling: pip install docling
from crewai import Agent, Task, Crew, Process, Knowledge
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource

# Assume an LLM is configured (e.g., via environment variables or passed to Agent/Crew)
# from langchain_openai import ChatOpenAI

# Define the knowledge source - point to the file inside the 'knowledge' directory
# The path is relative to the 'knowledge' directory
manual_source = CrewDoclingSource(file_paths=["widget_pro_manual.txt"])

# Create the Knowledge object, give it a name, and pass the sources
# This will load, chunk, embed, and store the manual's content
product_knowledge = Knowledge(
    collection_name="widget_pro_manual",  # Name for the storage collection
    sources=[manual_source],
    # embedder=...  # Optional: embedding config; a default is used otherwise
    # storage=...   # Optional: storage config; defaults to ChromaDB
)
```

**Explanation:**

* We import `Knowledge` and `CrewDoclingSource`.
* `CrewDoclingSource(file_paths=["widget_pro_manual.txt"])`: We create a source pointing to our file. Note: the path is relative to the `knowledge` directory. `CrewDoclingSource` handles loading various file types.
* `Knowledge(collection_name="widget_pro_manual", sources=[manual_source])`: We create the main `Knowledge` object.
  * `collection_name`: A unique name for this set of knowledge in the vector database.
  * `sources`: A list containing the `manual_source` we defined.
* When this line runs, CrewAI automatically processes `widget_pro_manual.txt` and stores it in the vector database under the collection `widget_pro_manual`.
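
Once the `Knowledge` object is built, you can sanity-check retrieval directly, without involving an agent. A minimal sketch, assuming the `query()` signature shown in the simplified source later in this chapter (a list of query strings plus an optional result limit):

```python
# Hypothetical sanity check: query the knowledge base directly.
# Assumes query(self, query: List[str], limit: int = 3) -> List[Dict[str, Any]]
results = product_knowledge.query(["How do I reset my Widget Pro?"], limit=3)

for chunk in results:
    print(chunk)  # each result dict carries the matched text plus metadata
```

If the printed chunks mention the reset procedure, ingestion worked and agents will see the same context.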
**3. Equip an Agent with Knowledge:**
|
||||
|
||||
You can add the `Knowledge` object directly to an agent.
|
||||
|
||||
```python
# Define the agent and give it the knowledge base
support_agent = Agent(
    role='Product Support Specialist',
    goal='Answer customer questions accurately based ONLY on the Widget Pro manual.',
    backstory='You are an expert support agent with deep knowledge of the Widget Pro, derived exclusively from its official manual.',
    knowledge=product_knowledge,  # <-- Assign the knowledge here!
    verbose=True,
    allow_delegation=False,
    # llm=ChatOpenAI(model="gpt-4")  # Example LLM
)

# Define a task for the agent
support_task = Task(
    description="The customer asks: 'How do I reset my Widget Pro?' Use the manual to find the answer.",
    expected_output="A clear, step-by-step answer based solely on the provided manual content.",
    agent=support_agent
)

# Create and run the crew
support_crew = Crew(
    agents=[support_agent],
    tasks=[support_task],
    process=Process.sequential
)

# result = support_crew.kickoff()
# print(result)
```

**Explanation:**

* When defining `support_agent`, we pass our `product_knowledge` object to the `knowledge` parameter: `knowledge=product_knowledge`.
* Now, whenever `support_agent` works on a `Task`, it will automatically query the `product_knowledge` base for relevant information *before* calling its [LLM](06_llm.md).
* The retrieved text chunks from `widget_pro_manual.txt` will be added to the context given to the [LLM](06_llm.md), strongly guiding it to answer based on the manual.

**Expected Outcome (Conceptual):**

When `support_crew.kickoff()` runs:

1. `support_agent` receives `support_task`.
2. The agent (internally) queries `product_knowledge` with something like "How do I reset my Widget Pro?".
3. The vector database finds chunks from `widget_pro_manual.txt` that are semantically similar (e.g., sections describing the reset procedure).
4. These relevant text chunks are retrieved.
5. The agent's [LLM](06_llm.md) receives the task description *plus* the retrieved manual excerpts as context.
6. The [LLM](06_llm.md) generates the answer based heavily on the provided manual text.
7. The final `result` will be the step-by-step reset instructions derived from the manual.

*(Alternatively, you can assign `Knowledge` at the `Crew` level using the `knowledge` parameter, making it available to all agents in the crew; a sketch follows below.)*
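
A minimal sketch of that crew-level alternative. This assumes the `Crew` constructor accepts the same `knowledge` parameter as `Agent`; the exact parameter name has varied across CrewAI releases, so check your installed version:

```python
# Hypothetical crew-level assignment: every agent in this crew can
# query product_knowledge, not just support_agent.
shared_crew = Crew(
    agents=[support_agent],
    tasks=[support_task],
    process=Process.sequential,
    knowledge=product_knowledge,  # assumed parameter; see note above
)
```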
## How Knowledge Retrieval Works Internally

When an [Agent](02_agent.md) with assigned `Knowledge` executes a [Task](03_task.md):

1. **Task Start:** The agent begins processing the task.
2. **Context Building:** The agent prepares the information needed for its [LLM](06_llm.md). This includes the task description, its role/goal/backstory, and any context from `Memory` (if enabled).
3. **Knowledge Query:** The agent identifies the need for information related to the task. It formulates a query (often based on the task description or key terms) and sends it to its assigned `Knowledge` object.
4. **Storage Search:** The `Knowledge` object passes the query to its underlying `KnowledgeStorage` (the vector database, e.g., ChromaDB).
5. **Vector Similarity Search:** The vector database converts the query into an embedding and searches for stored text chunks whose embeddings are closest (most similar) to the query embedding.
6. **Retrieve Chunks:** The database returns the top N most relevant text chunks (along with metadata and scores).
7. **Augment Prompt:** The agent takes these retrieved text chunks and adds them as specific context to the prompt it's preparing for the [LLM](06_llm.md). The prompt might now look something like: "Your task is: [...task description...]. Here is relevant information from the knowledge base: [...retrieved chunk 1...] [...retrieved chunk 2...] Now, provide the final answer." (A rough sketch of this assembly follows below.)
8. **LLM Call:** The agent sends this augmented prompt to its [LLM](06_llm.md).
9. **Generate Response:** The [LLM](06_llm.md), now equipped with highly relevant context directly from the specified knowledge source, generates a more accurate and grounded response.
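
Step 7 is plain string assembly. Here is a rough sketch of what that augmentation could look like; `augment_prompt` is a hypothetical helper, not CrewAI's actual formatting:

```python
def augment_prompt(task_prompt: str, chunks: list[str]) -> str:
    """Append retrieved knowledge chunks to the task prompt as context."""
    if not chunks:
        return task_prompt
    context = "\n\n".join(f"- {chunk}" for chunk in chunks)
    return (
        f"{task_prompt}\n\n"
        f"Here is relevant information from the knowledge base:\n{context}\n\n"
        "Now, provide the final answer."
    )

augmented = augment_prompt(
    "Answer the customer: 'How do I reset my Widget Pro?'",
    ["To reset Widget Pro, hold the power button for 10 seconds."],
)
print(augmented)
```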
Let's visualize this retrieval process:

```mermaid
sequenceDiagram
    participant A as Agent
    participant K as Knowledge Object
    participant KS as KnowledgeStorage (Vector DB)
    participant LLM as Agent's LLM

    A->>A: Start Task ('How to reset Widget Pro?')
    A->>A: Prepare base prompt (Task, Role, Goal...)
    A->>K: Query('How to reset Widget Pro?')
    K->>KS: Search(query='How to reset Widget Pro?')
    Note right of KS: Finds similar chunks via embeddings
    KS-->>K: Return relevant chunks from manual
    K-->>A: Provide relevant chunks
    A->>A: Augment prompt with retrieved chunks
    A->>LLM: Send augmented prompt
    LLM-->>A: Generate answer based on task + manual excerpts
    A->>A: Final Answer (Steps from manual)
```

## Diving into the Code (High Level)

* **`crewai/knowledge/knowledge.py`**:
  * The `Knowledge` class holds the list of `sources` and the `storage` object.
  * Its `__init__` method initializes the `KnowledgeStorage` (creating a default ChromaDB instance if none is provided) and then iterates through the `sources`, telling each one to `add()` its content to the storage.
  * The `query()` method simply delegates the search request to `self.storage.search()`.
```python
# Simplified view from crewai/knowledge/knowledge.py
class Knowledge(BaseModel):
    sources: List[BaseKnowledgeSource] = Field(default_factory=list)
    storage: Optional[KnowledgeStorage] = Field(default=None)
    embedder: Optional[Dict[str, Any]] = None
    collection_name: Optional[str] = None

    def __init__(self, collection_name: str, sources: List[BaseKnowledgeSource], ...):
        # ... set up storage (e.g., KnowledgeStorage(...)) ...
        self.sources = sources
        self.storage.initialize_knowledge_storage()
        self._add_sources()  # Tell sources to load/chunk/embed/save

    def query(self, query: List[str], limit: int = 3) -> List[Dict[str, Any]]:
        if self.storage is None:
            raise ValueError("Storage not initialized.")
        # Delegate search to the storage object
        return self.storage.search(query, limit)

    def _add_sources(self):
        for source in self.sources:
            source.storage = self.storage  # Give source access to storage
            source.add()  # Source loads, chunks, embeds, and saves
```

* **`crewai/knowledge/source/`**: Contains the different `KnowledgeSource` implementations.
  * `base_knowledge_source.py`: Defines the `BaseKnowledgeSource` abstract class, including the `add()` method placeholder and helper methods like `_chunk_text()` (a sketch of this kind of chunking follows below).
  * `crew_docling_source.py`: Implements loading from files and URLs using the `docling` library. Its `add()` method loads content, chunks it, and calls `self._save_documents()`.
  * `_save_documents()` (in `base_knowledge_source.py` or subclasses) typically calls `self.storage.save(self.chunks)`.
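
Chunking itself is simple. A minimal sketch of a fixed-size, overlapping splitter in the spirit of `_chunk_text()`; the sizes and overlap here are illustrative, not CrewAI's actual defaults:

```python
def chunk_text(text: str, chunk_size: int = 4000, overlap: int = 200) -> list[str]:
    """Fixed-size splitter; overlapping windows preserve context at chunk boundaries."""
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

manual_text = open("knowledge/widget_pro_manual.txt").read()
chunks = chunk_text(manual_text)  # each chunk is embedded and stored separately
```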
* **`crewai/knowledge/storage/knowledge_storage.py`**:
  * The `KnowledgeStorage` class acts as a wrapper around the actual vector database (ChromaDB by default).
  * `initialize_knowledge_storage()`: Sets up the connection to ChromaDB and gets/creates the specified collection.
  * `save()`: Takes the text chunks, gets their embeddings using the configured `embedder`, and `upsert`s them into the ChromaDB collection.
  * `search()`: Takes a query, gets its embedding, and uses the ChromaDB collection's `query()` method to find and return similar documents (see the ChromaDB round trip sketched below).
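
For a feel of what that wrapper delegates to, here is a bare ChromaDB round trip. This is standalone `chromadb` usage, not CrewAI's wrapper; ChromaDB embeds the documents with its default embedding function here:

```python
import chromadb

client = chromadb.Client()  # in-memory client; CrewAI persists to disk by default
collection = client.get_or_create_collection("widget_pro_manual")

# save(): store chunks (ChromaDB computes embeddings via its default embedder)
collection.upsert(
    ids=["chunk-1", "chunk-2"],
    documents=[
        "To reset Widget Pro, hold the power button for 10 seconds.",
        "Widget Pro ships in three colors.",
    ],
)

# search(): embed the query and return the most similar chunks
results = collection.query(query_texts=["How do I reset my Widget Pro?"], n_results=1)
print(results["documents"])
```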
* **`crewai/agent.py`**:
  * The `Agent` class has an optional `knowledge: Knowledge` attribute.
  * In the `execute_task` method, before calling the LLM, if `self.knowledge` exists, it calls `self.knowledge.query()` using the task prompt (or parts of it) as the query.
  * The results from `knowledge.query()` are formatted and added to the task prompt as additional context.
```python
# Simplified view from crewai/agent.py
class Agent(BaseAgent):
    knowledge: Optional[Knowledge] = Field(default=None, ...)
    # ... other fields ...

    def execute_task(self, task: Task, context: Optional[str] = None, ...) -> str:
        task_prompt = task.prompt()
        # ... add memory context if applicable ...

        # === KNOWLEDGE RETRIEVAL ===
        if self.knowledge:
            # Query the knowledge base using the task prompt
            agent_knowledge_snippets = self.knowledge.query([task_prompt])  # Or task.description
            if agent_knowledge_snippets:
                # Format the snippets into a context string
                agent_knowledge_context = extract_knowledge_context(agent_knowledge_snippets)
                if agent_knowledge_context:
                    # Add knowledge context to the prompt
                    task_prompt += agent_knowledge_context
        # ===========================

        # ... add crew knowledge context if applicable ...
        # ... prepare tools, create agent_executor ...

        # Call the LLM via agent_executor with the augmented task_prompt
        result = self.agent_executor.invoke({"input": task_prompt, ...})["output"]
        return result
```

## Conclusion

You've now learned about **`Knowledge`** in CrewAI! It's the mechanism for providing your agents with access to specific, pre-existing external information sources like documents or websites. By defining `KnowledgeSource`s, creating a `Knowledge` object, and assigning it to an [Agent](02_agent.md) or [Crew](01_crew.md), you enable your agents to retrieve relevant context from these sources using vector search. This makes their responses more accurate, grounded, and aligned with the specific information you provide, distinct from the general interaction history managed by [Memory](07_memory.md).

This concludes our introductory tour of the core concepts in CrewAI! You've learned about managing the team ([Crew](01_crew.md)), defining specialized workers ([Agent](02_agent.md)), assigning work ([Task](03_task.md)), equipping agents with abilities ([Tool](04_tool.md)), setting the workflow ([Process](05_process.md)), powering the agent's thinking ([LLM](06_llm.md)), giving them recall ([Memory](07_memory.md)), and providing external information ([Knowledge](08_knowledge.md)).

With these building blocks, you're ready to start creating sophisticated AI crews to tackle complex challenges! Happy building!

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
docs/CrewAI/index.md

# Tutorial: CrewAI

**CrewAI** is a framework for orchestrating *autonomous AI agents*.
Think of it like building a specialized team (a **Crew**) where each member (**Agent**) has a role, goal, and tools.
You assign **Tasks** to Agents, defining what needs to be done. The **Crew** manages how these Agents collaborate, following a specific **Process** (like sequential steps).
Agents use their "brain" (an **LLM**) and can utilize **Tools** (like web search) and access shared **Memory** or external **Knowledge** bases to complete their tasks effectively.

**Source Repository:** [https://github.com/crewAIInc/crewAI/tree/e723e5ca3fb7e4cb890c4befda47746aedbd7408/src/crewai](https://github.com/crewAIInc/crewAI/tree/e723e5ca3fb7e4cb890c4befda47746aedbd7408/src/crewai)
```mermaid
flowchart TD
    A0["Agent"]
    A1["Task"]
    A2["Crew"]
    A3["Tool"]
    A4["Process"]
    A5["LLM"]
    A6["Memory"]
    A7["Knowledge"]

    A2 -- "Manages" --> A0
    A2 -- "Orchestrates" --> A1
    A2 -- "Defines workflow" --> A4
    A2 -- "Manages shared" --> A6
    A0 -- "Executes" --> A1
    A0 -- "Uses" --> A3
    A0 -- "Uses as brain" --> A5
    A0 -- "Queries" --> A7
    A1 -- "Assigned to" --> A0
```

## Chapters

1. [Crew](01_crew.md)
2. [Agent](02_agent.md)
3. [Task](03_task.md)
4. [Tool](04_tool.md)
5. [Process](05_process.md)
6. [LLM](06_llm.md)
7. [Memory](07_memory.md)
8. [Knowledge](08_knowledge.md)
---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)