apify / crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆7,290 · Updated this week
Alternatives and similar repositories for crawlee-python
Users who are interested in crawlee-python are comparing it to the libraries listed below:
- Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions. ☆6,220 · Updated last week
- Python scraper based on AI ☆21,985 · Updated this week
- Lightweight library for scraping web-sites with LLMs ☆1,249 · Updated 2 months ago
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser sandbox that lets you automate the web wit… ☆6,038 · Updated this week
- Automate browser based workflows with AI ☆19,707 · Updated this week
- Turn any webpage into structured data using LLMs ☆6,128 · Updated last week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆4,299 · Updated 3 weeks ago
- Rapidly build AI apps in Python ☆6,497 · Updated this week
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆6,007 · Updated this week
- The first AI agent that builds permissionless integrations through reverse engineering platforms' internal APIs. ☆4,505 · Updated 3 months ago
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,243 · Updated 9 months ago
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web ☆2,324 · Updated 6 months ago
- 🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be! ☆8,312 · Updated this week
- A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from PDFs, Office documen… ☆2,847 · Updated this week
- The easiest way to use Agentic RAG in any enterprise ☆4,372 · Updated 10 months ago
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,697 · Updated 2 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆18,978 · Updated last month
- Large Action Model framework to develop AI Web Agents ☆6,215 · Updated 10 months ago
- Build better UIs faster. ☆8,926 · Updated last month
- Swiss-army tool for scraping and extracting data from online assets, made for hackers ☆4,574 · Updated last year
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆9,490 · Updated 7 months ago
- OCR & Document Extraction using vision models ☆11,985 · Updated 6 months ago
- The AI Browser Automation Framework ☆19,444 · Updated this week
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks ☆6,748 · Updated 6 months ago
- A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama ☆1,897 · Updated 3 weeks ago
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents… ☆2,952 · Updated last week
- Turn any website into clean data pipelines & structured APIs in minutes! ☆14,052 · Updated this week
- Stay on top of trending topics on social media and the web with AI ☆3,913 · Updated 10 months ago
- The fastest way to create an HTML app ☆6,736 · Updated this week
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ☆2,918 · Updated 2 months ago