zachary62 e62ee2cb13 init push
2025-04-04 13:01:50 -04:00

# Tutorial: Crawl4AI
`Crawl4AI` is a flexible Python library for *asynchronously crawling websites* and *extracting structured content*, designed specifically for **AI use cases**.

You interact primarily with the `AsyncWebCrawler`, which acts as the main coordinator. You give it URLs and a `CrawlerRunConfig` that specifies *how* to crawl (e.g., which strategies to use for fetching, scraping, filtering, and extraction).

It can crawl a single page or several URLs concurrently through a `BaseDispatcher`, optionally go deeper by following links with a `DeepCrawlStrategy`, control caching via `CacheMode`, and apply a `RelevantContentFilter`. Each crawl returns a `CrawlResult` containing all the gathered data.

**Source Repository:** [https://github.com/unclecode/crawl4ai/tree/9c58e4ce2ee025debd3f36bf213330bd72b90e46/crawl4ai](https://github.com/unclecode/crawl4ai/tree/9c58e4ce2ee025debd3f36bf213330bd72b90e46/crawl4ai)

```mermaid
flowchart TD
A0["AsyncWebCrawler"]
A1["CrawlerRunConfig"]
A2["AsyncCrawlerStrategy"]
A3["ContentScrapingStrategy"]
A4["ExtractionStrategy"]
A5["CrawlResult"]
A6["BaseDispatcher"]
A7["DeepCrawlStrategy"]
A8["CacheContext / CacheMode"]
A9["RelevantContentFilter"]
A0 -- "Configured by" --> A1
A0 -- "Uses Fetching Strategy" --> A2
A0 -- "Uses Scraping Strategy" --> A3
A0 -- "Uses Extraction Strategy" --> A4
A0 -- "Produces" --> A5
A0 -- "Uses Dispatcher for `arun_m..." --> A6
A0 -- "Uses Caching Logic" --> A8
A6 -- "Calls Crawler's `arun`" --> A0
A1 -- "Specifies Deep Crawl Strategy" --> A7
A7 -- "Processes Links from" --> A5
A3 -- "Provides Cleaned HTML to" --> A9
A1 -- "Specifies Content Filter" --> A9
```
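
To make the relationships in the diagram more concrete, here is a minimal, standard-library-only sketch of the same coordinator pattern. It is not the Crawl4AI API: `ToyCrawler`, `ToyRunConfig`, `ToyResult`, `fake_fetch`, `drop_stopwords`, and the `FAKE_PAGES` data are all hypothetical stand-ins, chosen only to show how a run config, a fetching step, a scraping step, a content filter, a cache, and a concurrency-limited dispatcher might fit together.

```python
import asyncio
import re
from dataclasses import dataclass, replace
from typing import Awaitable, Callable, Dict, List, Optional

# NOTE: every name in this block is a hypothetical stand-in, not the Crawl4AI API.

# Canned pages so the sketch runs without network access.
FAKE_PAGES = {
    "https://example.com/a": "<h1>Docs</h1><p>An async   crawler</p>",
    "https://example.com/b": "<p>Structured output for AI</p>",
}


@dataclass
class ToyRunConfig:
    """Stand-in for a run config: describes *how* to crawl."""
    content_filter: Optional[Callable[[str], str]] = None
    use_cache: bool = True


@dataclass
class ToyResult:
    """Stand-in for a crawl result: everything gathered for one URL."""
    url: str
    html: str
    cleaned: str
    from_cache: bool = False


async def fake_fetch(url: str) -> str:
    """Fetching step (assumed): returns canned HTML instead of hitting the network."""
    await asyncio.sleep(0)
    return FAKE_PAGES[url]


def drop_stopwords(text: str) -> str:
    """Toy content filter: removes a few filler words."""
    stop = {"a", "an", "the", "for"}
    return " ".join(w for w in text.split() if w.lower() not in stop)


class ToyCrawler:
    """Coordinator: fetch -> scrape -> filter -> result, with a simple cache."""

    def __init__(self, fetch: Callable[[str], Awaitable[str]]):
        self._fetch = fetch
        self._cache: Dict[str, ToyResult] = {}

    async def run(self, url: str, config: ToyRunConfig) -> ToyResult:
        if config.use_cache and url in self._cache:
            return replace(self._cache[url], from_cache=True)
        html = await self._fetch(url)
        # Scraping step: strip tags, then collapse runs of whitespace.
        cleaned = " ".join(re.sub(r"<[^>]+>", " ", html).split())
        if config.content_filter is not None:
            cleaned = config.content_filter(cleaned)
        result = ToyResult(url=url, html=html, cleaned=cleaned)
        self._cache[url] = result
        return result

    async def run_many(self, urls: List[str], config: ToyRunConfig,
                       max_concurrency: int = 2) -> List[ToyResult]:
        # Dispatcher role: cap concurrency, then call run() once per URL.
        sem = asyncio.Semaphore(max_concurrency)

        async def worker(u: str) -> ToyResult:
            async with sem:
                return await self.run(u, config)

        return await asyncio.gather(*(worker(u) for u in urls))


async def demo():
    crawler = ToyCrawler(fake_fetch)
    config = ToyRunConfig(content_filter=drop_stopwords)
    first = await crawler.run_many(list(FAKE_PAGES), config)
    again = await crawler.run("https://example.com/a", config)  # served from cache
    return first, again


if __name__ == "__main__":
    first, again = asyncio.run(demo())
    for r in first:
        print(r.url, "->", r.cleaned)
    print("second request from cache:", again.from_cache)
```

In this sketch the dispatcher role is folded into `run_many` and the strategies are plain callables. The diagram above shows the library itself treating the dispatcher and each strategy as separate components. The chapters below describe the actual classes.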
## Chapters
1. [AsyncCrawlerStrategy](01_asynccrawlerstrategy.md)
2. [AsyncWebCrawler](02_asyncwebcrawler.md)
3. [CrawlerRunConfig](03_crawlerrunconfig.md)
4. [ContentScrapingStrategy](04_contentscrapingstrategy.md)
5. [RelevantContentFilter](05_relevantcontentfilter.md)
6. [ExtractionStrategy](06_extractionstrategy.md)
7. [CrawlResult](07_crawlresult.md)
8. [DeepCrawlStrategy](08_deepcrawlstrategy.md)
9. [CacheContext / CacheMode](09_cachecontext___cachemode.md)
10. [BaseDispatcher](10_basedispatcher.md)
---
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)