apify / crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆7,998 Updated this week
Alternatives and similar repositories for crawlee-python
Users who are interested in crawlee-python are comparing it to the libraries listed below.
- Python scraper based on AI ☆22,517 Updated last week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,273 Updated 11 months ago
- Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions. ☆6,496 Updated last month
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆4,316 Updated 2 months ago
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser sandbox that lets you automate the web wit… ☆6,322 Updated last week
- Rapidly build AI apps in Python ☆6,519 Updated last week
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆6,088 Updated this week
- Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data … ☆21,543 Updated this week
- Turn any webpage into structured data using LLMs ☆6,176 Updated 2 months ago
- Automate browser based workflows with AI ☆20,305 Updated this week
- Turns Data and AI algorithms into production-ready web applications in no time. ☆19,067 Updated this week
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ☆16,679 Updated this week
- Lighter web automation with Python ☆8,210 Updated last week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆9,728 Updated 8 months ago
- Python APIs for web automation, testing, and bypassing bot-detection with ease. ☆12,146 Updated last week
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks ☆6,794 Updated last month
- OCR & Document Extraction using vision models ☆12,070 Updated 8 months ago
- A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API. ☆10,486 Updated last week
- Lightweight library for scraping web-sites with LLMs ☆1,262 Updated last month
- A language model programming library. ☆5,878 Updated 8 months ago
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents… ☆2,971 Updated last month
- Build better UIs faster. ☆8,949 Updated 3 months ago
- Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision. ☆4,182 Updated this week
- Simple, unified interface to multiple Generative AI providers ☆13,425 Updated last month
- Large Action Model framework to develop AI Web Agents ☆6,284 Updated last year
- ✨ The Next Gen Airtable Alternative: No-Code Postgres ☆20,783 Updated last week
- The AI Browser Automation Framework ☆20,670 Updated last week
- The first AI agent that builds permissionless integrations through reverse engineering platforms' internal APIs. ☆4,535 Updated 5 months ago
- GenAI Agent Framework, the Pydantic way ☆14,536 Updated last week
- 🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be! ☆8,887 Updated this week