Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆8,108 · Updated this week
Alternatives and similar repositories for crawlee-python
Users who are interested in crawlee-python are comparing it to the libraries listed below.
- Python scraper based on AI ☆22,786 · Updated this week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆60,971 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,918 · Sep 30, 2025 · Updated 4 months ago
- Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data … ☆21,859 · Updated this week
- 🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data ☆84,899 · Updated this week
- Automate browser based workflows with AI ☆20,530 · Updated this week
- The programming language for agentic software. Build, run, and manage multi-agent systems at scale. ☆38,104 · Updated this week
- Universal memory layer for AI Agents ☆47,994 · Updated this week
- An open-source RAG-based tool for chatting with your documents. ☆25,152 · Jul 4, 2025 · Updated 7 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,360 · Updated this week
- Get your documents ready for gen AI ☆54,094 · Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,278 · Feb 21, 2025 · Updated last year
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆79,028 · Updated this week
- The AI Browser Automation Framework ☆21,261 · Updated this week
- Turns Data and AI algorithms into production-ready web applications in no time. ☆19,090 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆37,083 · Updated this week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ☆32,573 · Updated this week
- ✨ The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minu… ☆15,079 · Updated this week
- An autonomous agent that conducts deep research on any data using any LLM providers. ☆25,376 · Updated this week
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆6,438 · Updated this week
- Perplexica is an AI-powered answering engine. ☆29,068 · Feb 13, 2026 · Updated 2 weeks ago
- Convert PDF to markdown + JSON quickly with high accuracy ☆31,857 · Feb 9, 2026 · Updated 2 weeks ago
- We write your reusable computer vision tools. 💜 ☆36,543 · Updated this week
- SOTA Open Source TTS ☆24,983 · Feb 2, 2026 · Updated 3 weeks ago
- Large Action Model framework to develop AI Web Agents ☆6,303 · Jan 21, 2025 · Updated last year
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆44,662 · Updated this week
- 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄. ☆22,726 · Feb 2, 2026 · Updated 3 weeks ago
- 🙌 OpenHands: AI-Driven Development ☆68,154 · Updated this week
- PraisonAI is a production-ready Multi AI Agents framework, designed to create AI Agents to automate and solve problems ranging from simpl… ☆5,598 · Feb 19, 2026 · Updated last week
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ☆16,973 · Updated this week
- Self-hosted AI coding assistant ☆32,939 · Updated this week
- ✨ The Next Gen Airtable Alternative: No-Code Postgres ☆20,947 · Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆9,928 · May 8, 2025 · Updated 9 months ago
- Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG. ☆23,218 · Oct 28, 2025 · Updated 3 months ago
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆31,031 · Feb 20, 2026 · Updated last week
- The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more. ☆54,878 · Feb 21, 2026 · Updated last week
- 🕸️ Web apps in pure Python 🐍 ☆28,152 · Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,182 · Updated this week
- A framework for building realtime voice AI agents 🤖🎙️📹 ☆9,441 · Updated this week