init push

2026-01-11 10:44:27 +01:00 · 2025-04-04 13:01:50 -04:00
parent 97c20e803a
commit e62ee2cb13
162 changed files with 42423 additions and 11 deletions
--- a/output/Crawl4AI/index.md
+++ b/output/Crawl4AI/index.md
@@ -0,0 +1,52 @@
+# Tutorial: Crawl4AI
+
+`Crawl4AI` is a flexible Python library for *asynchronously crawling websites* and *extracting structured content*, specifically designed for **AI use cases**.
+You primarily interact with the `AsyncWebCrawler`, which acts as the main coordinator. You give it URLs and a `CrawlerRunConfig` that specifies *how* to crawl (e.g., which strategies to use for fetching, scraping, filtering, and extraction).
+It can crawl a single page or many URLs concurrently through a `BaseDispatcher`, and can optionally crawl deeper by following links with a `DeepCrawlStrategy`. It also manages caching via `CacheMode` and applies a `RelevantContentFilter` before returning a `CrawlResult` that contains all the gathered data.
+
+
+**Source Repository:** [https://github.com/unclecode/crawl4ai/tree/9c58e4ce2ee025debd3f36bf213330bd72b90e46/crawl4ai](https://github.com/unclecode/crawl4ai/tree/9c58e4ce2ee025debd3f36bf213330bd72b90e46/crawl4ai)
+
+```mermaid
+flowchart TD
+    A0["AsyncWebCrawler"]
+    A1["CrawlerRunConfig"]
+    A2["AsyncCrawlerStrategy"]
+    A3["ContentScrapingStrategy"]
+    A4["ExtractionStrategy"]
+    A5["CrawlResult"]
+    A6["BaseDispatcher"]
+    A7["DeepCrawlStrategy"]
+    A8["CacheContext / CacheMode"]
+    A9["RelevantContentFilter"]
+    A0 -- "Configured by" --> A1
+    A0 -- "Uses Fetching Strategy" --> A2
+    A0 -- "Uses Scraping Strategy" --> A3
+    A0 -- "Uses Extraction Strategy" --> A4
+    A0 -- "Produces" --> A5
+    A0 -- "Uses Dispatcher for `arun_m..." --> A6
+    A0 -- "Uses Caching Logic" --> A8
+    A6 -- "Calls Crawler's `arun`" --> A0
+    A1 -- "Specifies Deep Crawl Strategy" --> A7
+    A7 -- "Processes Links from" --> A5
+    A3 -- "Provides Cleaned HTML to" --> A9
+    A1 -- "Specifies Content Filter" --> A9
+```
+
+## Chapters
+
+1. [AsyncCrawlerStrategy](01_asynccrawlerstrategy.md)
+2. [AsyncWebCrawler](02_asyncwebcrawler.md)
+3. [CrawlerRunConfig](03_crawlerrunconfig.md)
+4. [ContentScrapingStrategy](04_contentscrapingstrategy.md)
+5. [RelevantContentFilter](05_relevantcontentfilter.md)
+6. [ExtractionStrategy](06_extractionstrategy.md)
+7. [CrawlResult](07_crawlresult.md)
+8. [DeepCrawlStrategy](08_deepcrawlstrategy.md)
+9. [CacheContext / CacheMode](09_cachecontext___cachemode.md)
+10. [BaseDispatcher](10_basedispatcher.md)
+
+
+---
+
+Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
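
The overview added in `index.md` describes a coordinator that takes a URL plus a run configuration, fetches, scrapes, filters, optionally consults a cache, and returns a result object, with a dispatcher fanning out over many URLs. The sketch below is a toy model of that flow using only the standard library. Every name in it (`ToyCrawler`, `ToyRunConfig`, `ToyResult`, `crawl_one`, `crawl_many`, the in-memory `pages` dictionary) is hypothetical and chosen for illustration; it is not the Crawl4AI API and makes no network requests.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyRunConfig:
    """Hypothetical stand-in for a run configuration: *how* to crawl."""
    keyword: str = ""        # content filter: keep only lines containing this
    use_cache: bool = True   # whether cached results may be returned


@dataclass
class ToyResult:
    """Hypothetical stand-in for the object a crawl returns."""
    url: str
    lines: list = field(default_factory=list)
    from_cache: bool = False


class ToyCrawler:
    """Toy coordinator: cache lookup -> fetch -> scrape -> filter -> result."""

    def __init__(self, pages):
        self._pages = pages  # fake "web": url -> raw text
        self._cache = {}     # (url, keyword) -> ToyResult

    async def _fetch(self, url):
        await asyncio.sleep(0)  # stands in for real network I/O
        return self._pages[url]

    async def crawl_one(self, url, config):
        key = (url, config.keyword)
        if config.use_cache and key in self._cache:
            cached = self._cache[key]
            return ToyResult(url, list(cached.lines), from_cache=True)
        raw = await self._fetch(url)
        # "Scraping": normalise raw text into clean, non-empty lines.
        lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
        # "Filtering": keep only the lines relevant to the keyword.
        if config.keyword:
            lines = [ln for ln in lines if config.keyword in ln]
        result = ToyResult(url, lines)
        self._cache[key] = result
        return result

    async def crawl_many(self, urls, config, max_concurrency=2):
        # "Dispatcher": bound concurrency, then call crawl_one per URL.
        sem = asyncio.Semaphore(max_concurrency)

        async def one(url):
            async with sem:
                return await self.crawl_one(url, config)

        # gather() returns results in the same order as the input URLs.
        return await asyncio.gather(*(one(u) for u in urls))


async def demo():
    pages = {"a": "AI news\nsports\n", "b": "weather\nAI tools\n"}
    crawler = ToyCrawler(pages)
    cfg = ToyRunConfig(keyword="AI")
    first = await crawler.crawl_many(["a", "b"], cfg)
    second = await crawler.crawl_one("a", cfg)  # served from the toy cache
    return first, second


if __name__ == "__main__":
    first, second = asyncio.run(demo())
    for r in first + [second]:
        print(r.url, r.lines, "cached" if r.from_cache else "fresh")
```

The point of the toy is the shape of the control flow rather than any specific behavior: the dispatcher calls back into the coordinator's single-URL method, as the diagram's `A6 --> A0` edge suggests, and the cache sits in front of fetching. How the real library implements each stage is covered in the chapters the index links to.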