apify / crawlee-python
Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
5,612 stars, updated this week
Alternatives and similar repositories for crawlee-python:
Users who are interested in crawlee-python are comparing it to the libraries listed below:
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. (6,171 stars, updated last month)
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. (6,377 stars, updated 2 months ago)
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ (8,663 stars, updated this week)
- Turn any webpage into structured data using LLMs (4,824 stars, updated 8 months ago)
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web wi… (4,292 stars, updated this week)
- Agent Framework / shim to use Pydantic with LLMs (9,143 stars, updated this week)
- A visual playground for agentic workflows: Iterate over your agents 10x faster (4,818 stars, updated last month)
- Large Action Model framework to develop AI Web Agents (6,036 stars, updated 3 months ago)
- Automate browser-based workflows with LLMs and Computer Vision (13,275 stars, updated this week)
- 🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be! (2,972 stars, updated last week)
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents (5,176 stars, updated this week)
- The easiest way to use Agentic RAG in any enterprise (4,210 stars, updated 3 months ago)
- Agno is a lightweight library for building Agents with memory, knowledge, tools and reasoning. (26,158 stars, updated this week)
- Rapidly build AI apps in Python (6,219 stars, updated last week)
- Flexible and powerful framework for managing multiple AI agents and handling complex conversations (4,931 stars, updated this week)
- Build Real-Time Knowledge Graphs for AI Agents (8,122 stars, updated this week)
- The fast, Pythonic way to build MCP servers and clients (8,458 stars, updated this week)
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry (4,036 stars, updated 2 months ago)
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. (24,223 stars, updated this week)
- Toolkit for linearizing PDFs for LLM datasets/training (12,238 stars, updated this week)
- The Python library for real-time communication (3,824 stars, updated 2 weeks ago)
- Pydoll is a library for automating Chromium-based browsers without a WebDriver, offering realistic interactions. (3,493 stars, updated this week)
- Task-Aware Agent-driven Prompt Optimization Framework (3,218 stars, updated last month)
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks (6,507 stars, updated 3 weeks ago)
- A language model programming library. (5,754 stars, updated 2 months ago)
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … (7,804 stars, updated this week)
- AI search engine - self-host with local or cloud LLMs (3,296 stars, updated 7 months ago)
- Fully local web research and report writing assistant (7,308 stars, updated last month)
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API. (37,616 stars, updated this week)
- PraisonAI is a production-ready Multi AI Agents framework, designed to create AI Agents to automate and solve problems ranging from simpl… (4,187 stars, updated last month)