Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆8,724 · Apr 3, 2026 · Updated this week
Alternatives and similar repositories for crawlee-python
Users that are interested in crawlee-python are comparing it to the libraries listed below.
- Python scraper based on AI ☆23,249 · Updated this week
- Crawlee: A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data … ☆22,661 · Apr 1, 2026 · Updated last week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆63,500 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆28,051 · Sep 30, 2025 · Updated 6 months ago
- 🔥 The Web Data API for AI - Power AI agents with clean web data ☆104,217 · Updated this week
- Build, run, manage agentic software at scale. ☆39,153 · Updated this week
- Automate browser based workflows with AI ☆21,068 · Updated this week
- Universal memory layer for AI Agents ☆52,137 · Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,557 · Updated this week
- An open-source RAG-based tool for chatting with your documents. ☆25,251 · Updated this week
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆86,467 · Updated this week
- Get your documents ready for gen AI ☆57,163 · Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,343 · Feb 21, 2025 · Updated last year
- The SDK For Browser Agents ☆21,897 · Updated this week
- We write your reusable computer vision tools. 💜 ☆37,644 · Apr 1, 2026 · Updated last week
- An autonomous agent that conducts deep research on any data using any LLM providers ☆26,202 · Mar 14, 2026 · Updated 3 weeks ago
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆41,858 · Apr 2, 2026 · Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy ☆33,352 · Updated this week
- Vane is an AI-powered answering engine. ☆33,659 · Mar 27, 2026 · Updated last week
- 🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in min… ☆15,323 · Mar 30, 2026 · Updated last week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ☆33,854 · Mar 26, 2026 · Updated 2 weeks ago
- Self-hosted AI coding assistant ☆33,311 · Mar 2, 2026 · Updated last month
- SOTA Open Source TTS ☆29,048 · Mar 30, 2026 · Updated last week
- LLM-Driven Extraction of Unstructured Data - Built for API Deployments & ETL Pipeline Workflows ☆6,522 · Updated this week
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆48,311 · Updated this week
- 🙌 OpenHands: AI-Driven Development ☆70,666 · Updated this week
- Large Action Model framework to develop AI Web Agents ☆6,318 · Jan 21, 2025 · Updated last year
- Run agents that work for you based on what you do. AI finally knows what you are doing ☆18,022 · Updated this week
- 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄. ☆23,201 · Feb 2, 2026 · Updated 2 months ago
- ✨ The Next Gen Airtable Alternative: No-Code Postgres ☆21,094 · Updated this week
- Turns Data and AI algorithms into production-ready web applications in no time. ☆19,153 · Apr 2, 2026 · Updated last week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆10,455 · May 8, 2025 · Updated 11 months ago
- PraisonAI 🦞 - Your 24/7 AI employee team. Automate and solve complex challenges with low-code multi-agent AI that plans, researches, cod… ☆6,626 · Updated this week
- The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration. ☆57,599 · Updated this week
- aider is AI pair programming in your terminal ☆42,778 · Mar 17, 2026 · Updated 3 weeks ago
- Turn any webpage into structured data using LLMs ☆6,253 · Mar 27, 2026 · Updated last week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆31,971 · Updated this week
- Python tool for converting files and office documents to Markdown. ☆93,259 · Mar 30, 2026 · Updated last week
- 🕸️ Web apps in pure Python 🐍 ☆28,268 · Updated this week