Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆8,612 · Mar 13, 2026 · Updated this week
Alternatives and similar repositories for crawlee-python
Users who are interested in crawlee-python are comparing it to the libraries listed below.
- Python scraper based on AI ☆23,032 · Updated this week
- Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data … ☆22,366 · Updated this week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆62,080 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆28,006 · Sep 30, 2025 · Updated 5 months ago
- 🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data ☆93,251 · Updated this week
- Build, run, manage agentic software at scale. ☆38,700 · Updated this week
- Universal memory layer for AI Agents ☆50,147 · Updated this week
- Automate browser based workflows with AI ☆20,834 · Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,477 · Mar 1, 2026 · Updated 2 weeks ago
- An open-source RAG-based tool for chatting with your documents. ☆25,205 · Mar 8, 2026 · Updated last week
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆81,169 · Updated this week
- Get your documents ready for gen AI ☆55,944 · Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,348 · Feb 21, 2025 · Updated last year
- The AI Browser Automation Framework ☆21,583 · Updated this week
- We write your reusable computer vision tools. 💜 ☆36,705 · Updated this week
- An autonomous agent that conducts deep research on any data using any LLM providers ☆25,718 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆39,597 · Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy ☆32,617 · Mar 10, 2026 · Updated last week
- Vane is an AI-powered answering engine. ☆33,063 · Mar 10, 2026 · Updated last week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ☆33,400 · Mar 6, 2026 · Updated last week
- 🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in min… ☆15,249 · Mar 12, 2026 · Updated last week
- Self-hosted AI coding assistant ☆33,022 · Mar 2, 2026 · Updated 2 weeks ago
- SOTA Open Source TTS ☆27,364 · Mar 13, 2026 · Updated last week
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆46,408 · Updated this week
- 🙌 OpenHands: AI-Driven Development ☆69,254 · Updated this week
- LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows ☆6,487 · Updated this week
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ☆17,236 · Updated this week
- Large Action Model framework to develop AI Web Agents ☆6,318 · Jan 21, 2025 · Updated last year
- 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄. ☆22,971 · Feb 2, 2026 · Updated last month
- ✨ The Next Gen Airtable Alternative: No-Code Postgres ☆21,031 · Updated this week
- Turns Data and AI algorithms into production-ready web applications in no time. ☆19,113 · Mar 12, 2026 · Updated last week
- PraisonAI 🦞 - Your 24/7 AI employee team. Automate and solve complex challenges with low-code multi-agent AI that plans, researches, cod… ☆5,665 · Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆10,284 · May 8, 2025 · Updated 10 months ago
- The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration. ☆56,228 · Updated this week
- aider is AI pair programming in your terminal ☆41,939 · Mar 9, 2026 · Updated last week
- Turn any webpage into structured data using LLMs ☆6,236 · Mar 3, 2026 · Updated 2 weeks ago
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆31,474 · Updated this week
- Python tool for converting files and office documents to Markdown. ☆90,728 · Mar 10, 2026 · Updated last week
- 🕸️ Web apps in pure Python 🐍 ☆28,232 · Updated this week