Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆8,286 · Mar 6, 2026 · Updated this week
Alternatives and similar repositories for crawlee-python
Users who are interested in crawlee-python compare it to the libraries listed below.
- Python scraper based on AI ☆22,845 · Feb 24, 2026 · Updated last week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆61,332 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,949 · Sep 30, 2025 · Updated 5 months ago
- Crawlee: A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data … ☆22,013 · Updated this week
- 🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data ☆89,344 · Updated this week
- Automate browser based workflows with AI ☆20,629 · Updated this week
- Build, run, manage agentic software at scale. ☆38,516 · Updated this week
- Universal memory layer for AI Agents ☆48,604 · Updated this week
- An open-source RAG-based tool for chatting with your documents. ☆25,168 · Mar 2, 2026 · Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,392 · Mar 1, 2026 · Updated last week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,343 · Feb 21, 2025 · Updated last year
- Get your documents ready for gen AI ☆54,754 · Updated this week
- The AI Browser Automation Framework ☆21,356 · Updated this week
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆79,644 · Updated this week
- Turns Data and AI algorithms into production-ready web applications in no time. ☆19,096 · Updated this week
- 🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in min… ☆15,202 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆37,994 · Updated this week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ☆33,279 · Updated this week
- An autonomous agent that conducts deep research on any data using any LLM providers ☆25,577 · Mar 1, 2026 · Updated last week
- LLM-Driven Extraction of Unstructured Data: Built for API Deployments & ETL Pipeline Workflows ☆6,477 · Updated this week
- Perplexica is an AI-powered answering engine. ☆30,120 · Feb 13, 2026 · Updated 3 weeks ago
- We write your reusable computer vision tools. 💜 ☆36,612 · Mar 2, 2026 · Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy ☆32,069 · Mar 1, 2026 · Updated last week
- SOTA Open Source TTS ☆25,154 · Updated this week
- Large Action Model framework to develop AI Web Agents ☆6,311 · Jan 21, 2025 · Updated last year
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆45,147 · Updated this week
- 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄. ☆22,891 · Feb 2, 2026 · Updated last month
- 🙌 OpenHands: AI-Driven Development ☆68,459 · Updated this week
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ☆17,068 · Updated this week
- PraisonAI is a production-ready Multi AI Agents framework, designed to create AI Agents to automate and solve problems ranging from simpl… ☆5,638 · Updated this week
- Self-hosted AI coding assistant ☆32,982 · Updated this week
- ✨ The Next Gen Airtable Alternative: No-Code Postgres ☆20,984 · Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆10,096 · May 8, 2025 · Updated 10 months ago
- Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG. ☆23,318 · Oct 28, 2025 · Updated 4 months ago
- The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration. ☆55,868 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆31,296 · Updated this week
- 🕸️ Web apps in pure Python 🐍 ☆28,187 · Feb 23, 2026 · Updated last week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,206 · Mar 1, 2026 · Updated last week
- A framework for building realtime voice AI agents 🤖🎙️📹 ☆9,562 · Updated this week