Mirror of https://github.com/aljazceru/Tutorial-Codebase-Knowledge.git (synced 2026-02-03 05:34:30 +01:00)
update nav
@@ -1,3 +1,10 @@
---
layout: default
title: "Agent"
parent: "Browser Use"
nav_order: 1
---

# Chapter 1: The Agent - Your Browser Assistant's Brain

Welcome to the `Browser Use` tutorial! We're excited to help you learn how to automate web tasks using the power of Large Language Models (LLMs).

@@ -1,3 +1,10 @@
---
layout: default
title: "System Prompt"
parent: "Browser Use"
nav_order: 2
---

# Chapter 2: The System Prompt - Setting the Rules for Your AI Assistant

In [Chapter 1: The Agent](01_agent.md), we met the `Agent`, our project manager for automating browser tasks. We saw it consults a Large Language Model (LLM) – the "planner" – to decide the next steps based on the current state of the webpage. But how does the Agent tell the LLM *how* it should think, behave, and respond? Just giving it the task isn't enough!

@@ -1,3 +1,10 @@
---
layout: default
title: "BrowserContext"
parent: "Browser Use"
nav_order: 3
---

# Chapter 3: BrowserContext - The Agent's Isolated Workspace

In the [previous chapter](02_system_prompt.md), we learned how the `System Prompt` acts as the rulebook for the AI assistant (LLM) that guides our `Agent`. We know the Agent uses the LLM to decide *what* to do next based on the current situation in the browser.

@@ -1,3 +1,10 @@
---
layout: default
title: "DOM Representation"
parent: "Browser Use"
nav_order: 4
---

# Chapter 4: DOM Representation - Mapping the Webpage

In the [previous chapter](03_browsercontext.md), we learned about the `BrowserContext`, the Agent's private workspace for browsing. We saw that the Agent uses `browser_context.get_state()` to get a snapshot of the current webpage. But how does the Agent actually *understand* the content of that snapshot?

@@ -1,3 +1,10 @@
---
layout: default
title: "Action Controller & Registry"
parent: "Browser Use"
nav_order: 5
---

# Chapter 5: Action Controller & Registry - The Agent's Hands and Toolbox

In the [previous chapter](04_dom_representation.md), we saw how the `DomService` creates a simplified map (`DOMState`) of the webpage, allowing the Agent and its LLM planner to identify interactive elements like buttons and input fields using unique numbers (`highlight_index`). The LLM uses this map to decide *what* specific action to take next, like "click element [5]" or "type 'hello world' into element [12]".

@@ -1,3 +1,10 @@
---
layout: default
title: "Message Manager"
parent: "Browser Use"
nav_order: 6
---

# Chapter 6: Message Manager - Keeping the Conversation Straight

In the [previous chapter](05_action_controller___registry.md), we learned how the `Action Controller` and `Registry` act as the Agent's "hands" and "toolbox", executing the specific actions decided by the LLM planner. But how does the LLM get all the information it needs to make those decisions in the first place? How does the Agent keep track of the ongoing conversation, including what it "saw" on the page and what happened after each action?

@@ -1,3 +1,10 @@
---
layout: default
title: "Data Structures (Views)"
parent: "Browser Use"
nav_order: 7
---

# Chapter 7: Data Structures (Views) - The Project's Blueprints

In the [previous chapter](06_message_manager.md), we saw how the `MessageManager` acts like a secretary, carefully organizing the conversation between the [Agent](01_agent.md) and the LLM. It manages different pieces of information – the browser's current state, the LLM's plan, the results of actions, and more.

@@ -1,3 +1,10 @@
---
layout: default
title: "Telemetry Service"
parent: "Browser Use"
nav_order: 8
---

# Chapter 8: Telemetry Service - Helping Improve the Project (Optional)

In the [previous chapter](07_data_structures__views_.md), we explored the essential blueprints (`Data Structures (Views)`) that keep communication clear and consistent between all the parts of `Browser Use`. We saw how components like the [Agent](01_agent.md) and the [Action Controller & Registry](05_action_controller___registry.md) use these blueprints to exchange information reliably.

@@ -1,3 +1,10 @@
---
layout: default
title: "Browser Use"
nav_order: 4
has_children: true
---

# Tutorial: Browser Use

**Browser Use** is a project that allows an *AI agent* to control a web browser and perform tasks automatically.