apify / crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
☆17,692 Updated this week
Alternatives and similar repositories for crawlee
Users who are interested in crawlee are comparing it to the libraries listed below.
- Deploy headless browsers in Docker. Run on our cloud or bring your own. Free for non-commercial uses. ☆10,121 Updated this week
- Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuz… ☆23,022 Updated this week
- Visual builder for React. Build apps, websites, and content. Integrate with your codebase. ☆5,726 Updated this week
- 🧩 The Browser Extension Framework ☆11,767 Updated this week
- 💯 Teach puppeteer new tricks through plugins. ☆6,828 Updated 10 months ago
- 🌌 A complete search engine and RAG pipeline in your browser, server or edge network with support for full-text, vector, and hybrid sear… ☆9,387 Updated this week
- Visual Development for React, Vue, Svelte, Qwik, and more ☆8,117 Updated this week
- ✨ The Next Gen Airtable Alternative: No-Code Postgres ☆18,377 Updated this week
- Write components once, run everywhere. Compiles to React, Vue, Qwik, Solid, Angular, Svelte, and more. ☆13,141 Updated last week
- 🔥 🔥 🔥 Open Source Airtable Alternative ☆54,393 Updated this week
- Open-source developer platform to power your entire infra and turn scripts into webhooks, workflows and UIs. Fastest workflow engine (13x… ☆13,062 Updated this week
- Open Source realtime backend in 1 file ☆47,293 Updated this week
- Lightpanda: the headless browser designed for AI and automation ☆8,919 Updated this week
- Low-code platform for building business applications. Connect to databases, cloud storages, GraphQL, API endpoints, Airtable, Google shee… ☆35,693 Updated this week
- ⬛️ CLI tool and library for saving complete web pages as a single HTML file ☆13,634 Updated 3 weeks ago
- Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Dow… ☆5,665 Updated this week
- 🎥 Make videos programmatically with React ☆22,261 Updated this week
- State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server! ☆13,610 Updated last week
- The world's most flexible commerce platform. ☆29,075 Updated this week
- Trigger.dev – open source background jobs and AI infrastructure ☆11,221 Updated this week
- ⚡ Next-gen Web Extension Framework ☆6,870 Updated this week
- Presentation Slides for Developers ☆37,966 Updated this week
- Open source Loom alternative. Beautiful, shareable screen recordings. ☆9,688 Updated this week
- Simple, open source, lightweight and privacy-friendly web analytics alternative to Google Analytics. ☆22,439 Updated this week
- React Flow | Svelte Flow - Powerful open source libraries for building node-based UIs with React (https://reactflow.dev) or Svelte (https… ☆29,510 Updated last week
- A curated list of awesome puppeteer resources. ☆2,478 Updated 10 months ago
- Generate massive amounts of fake data in the browser and node.js ☆13,833 Updated last week
- Session replay, cobrowsing and product analytics you can self-host. Ideal for reproducing issues and iterating on your product. ☆10,093 Updated this week
- Next-generation full-text search library for Browser and Node.js ☆12,949 Updated last week
- Build like a team of hundreds_ ☆48,663 Updated this week