amerkurev / scrapper
Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.
★205 · Updated 6 months ago
Alternatives and similar repositories for scrapper:
Users who are interested in scrapper are comparing it to the libraries listed below.
- A blazing fast, async-first, undetectable webscraping/web automation framework based on ultrafunkamsterdam/nodriver. Now with Docker supp… ★226 · Updated this week
- Get Google, Yandex, Baidu search engine results via API or CLI for free ★392 · Updated 9 months ago
- Free IP Proxy rotator library for python ★199 · Updated this week
- converts url content into JSON with a simple prefix ★67 · Updated 9 months ago
- Turn Webpage to LLM friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, image & webpage links extr… ★95 · Updated last week
- A library to extract the main content from html. Developed for information on LLM and for feeding data into LangChain and LlamaIndex. ★33 · Updated 9 months ago
- Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are compl… ★334 · Updated last week
- Dabarqus is incredibly fast RAG that runs everywhere. ★56 · Updated 2 weeks ago
- The Web Scraping Club Free Repository ★136 · Updated 3 months ago
- (no description) ★256 · Updated 2 months ago
- GoalChain for goal-orientated LLM conversation flows ★68 · Updated 2 months ago
- Undetected Web-Scraping & Seamless HTML Parsing in Python! ★214 · Updated 2 weeks ago
- n8n node to interact with browserless instance ★120 · Updated 4 months ago
- Spider ported to Python ★66 · Updated 3 weeks ago
- Use AWS Lambda functions as a proxy pool to scrape web pages. ★127 · Updated last year
- AutoBrowse is an autonomous AI agent that can perform web browsing tasks. ★82 · Updated last year
- (no description) ★76 · Updated 3 weeks ago
- Import unstructured data (text and images) into structured tables ★146 · Updated 6 months ago
- Official implement of paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP 24'] ★451 · Updated last month
- (no description) ★98 · Updated 4 months ago
- ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bo… ★82 · Updated last year
- The open framework for question answering fine-tuning LLMs on private data ★69 · Updated last year
- A Function Calls Proxy for Groq, the fastest AI alive! ★187 · Updated 10 months ago
- (no description) ★46 · Updated last week
- A full stack app for interruptible, low-latency and near-human quality AI phone calls built from stitching LLMs, speech understanding too… ★114 · Updated 6 months ago
- Super Fast, Super Anti-Detect, and Super Intuitive Web Driver ★51 · Updated 5 months ago
- n8n node for browser automation using Puppeteer ★145 · Updated this week
- Ask a directory of files questions. Powered by ChromaDB and ChatGPT ★12 · Updated last year
- (no description) ★24 · Updated 10 months ago
- Simple Graph Memory for AI applications ★81 · Updated 6 months ago