apify / crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆6,175 Updated this week
Alternatives and similar repositories for crawlee-python
Users who are interested in crawlee-python are comparing it to the libraries listed below.
- Python scraper based on AI ☆21,023 Updated this week
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆5,654 Updated this week
- Large Action Model framework to develop AI Web Agents ☆6,137 Updated 6 months ago
- Turn any webpage into structured data using LLMs ☆5,953 Updated 3 months ago
- Rapidly build AI apps in Python ☆6,405 Updated 2 months ago
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web wi… ☆4,920 Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆9,088 Updated 3 months ago
- A powerful framework for building realtime voice AI agents 🤖🎙️📹 ☆7,054 Updated this week
- Lightweight library for scraping web-sites with LLMs ☆1,208 Updated 2 months ago
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆4,185 Updated 5 months ago
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,094 Updated 5 months ago
- Automate browser-based workflows with LLMs and Computer Vision ☆14,044 Updated this week
- Agent Framework / shim to use Pydantic with LLMs ☆11,603 Updated this week
- 🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be! ☆6,447 Updated this week
- Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions. ☆5,075 Updated this week
- Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision. ☆3,959 Updated this week
- The AI Browser Automation Framework ☆16,327 Updated this week
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … ☆10,417 Updated this week
- The fastest way to create an HTML app ☆6,586 Updated last week
- A language model programming library. ☆5,805 Updated 2 months ago
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API. ☆48,597 Updated this week
- Lightpanda: the headless browser designed for AI and automation ☆9,501 Updated this week
- Full-stack framework for building Multi-Agent Systems with memory, knowledge and reasoning. ☆31,838 Updated this week
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web ☆2,321 Updated 2 months ago
- ⚡️ GenBI (Generative BI) queries any database in natural language, generates accurate SQL (Text-to-SQL), charts (Text-to-Chart), and AI-p… ☆9,779 Updated this week
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ☆2,758 Updated 2 weeks ago
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,156 Updated last month
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆50,839 Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆18,192 Updated this week
- Stay on top of trending topics on social media and the web with AI ☆2,929 Updated 6 months ago