apify / crawlee-python
Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
☆6,256 · Updated this week
Alternatives and similar repositories for crawlee-python
Users who are interested in crawlee-python are comparing it to the libraries listed below.
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,131 · Updated 6 months ago
- Rapidly build AI apps in Python ☆6,416 · Updated 2 months ago
- Automate browser-based workflows with LLMs and Computer Vision ☆14,216 · Updated this week
- Python scraper based on AI ☆21,220 · Updated 3 weeks ago
- Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions. ☆5,205 · Updated last week
- Lightweight library for scraping web-sites with LLMs ☆1,210 · Updated 2 weeks ago
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser sandbox that lets you automate the web wit… ☆5,043 · Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆4,196 · Updated last week
- Large Action Model framework to develop AI Web Agents ☆6,160 · Updated 7 months ago
- Easiest no code web data extraction platform. Instantly turn any website into API or spreadsheet. ☆13,567 · Updated this week
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. ☆7,039 · Updated last week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆9,173 · Updated 4 months ago
- Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data … ☆19,350 · Updated this week
- A visual playground for agentic workflows: Iterate over your agents 10x faster ☆5,441 · Updated last month
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆5,746 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,243 · Updated 2 months ago
- The easiest way to use Agentic RAG in any enterprise ☆4,315 · Updated 7 months ago
- Open-source framework for building multi-agent systems with memory, knowledge and reasoning. ☆32,722 · Updated this week
- The AI Browser Automation Framework ☆16,707 · Updated last week
- Turn any webpage into structured data using LLMs ☆5,988 · Updated 3 months ago
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ☆6,579 · Updated 2 months ago
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … ☆10,547 · Updated this week
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extra… ☆2,737 · Updated this week
- A language model programming library. ☆5,826 · Updated 3 months ago
- An AI-powered search engine with a generative UI ☆8,070 · Updated this week
- Neo4j graph construction from unstructured data using LLMs ☆3,891 · Updated last week
- The python library for real-time communication ☆4,269 · Updated this week
- The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥 ☆55,161 · Updated this week
- PraisonAI is a production-ready Multi AI Agents framework, designed to create AI Agents to automate and solve problems ranging from simpl… ☆5,340 · Updated last week
- Build better UIs faster. ☆8,860 · Updated this week