teticio / lambda-scraper
Use AWS Lambda functions as a proxy pool to scrape web pages.
☆131 Updated last year
Alternatives and similar repositories for lambda-scraper
Users who are interested in lambda-scraper are comparing it to the libraries listed below.
- Staff fetcher library for LinkedIn - obtain experiences, schools, skills & contact info ☆144 Updated last month
- estela, an elastic web scraping cluster 🕸 ☆180 Updated 2 months ago
- Get structured JSON data from any page. ☆175 Updated last year
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built-in metadata extraction all in one pac… ☆276 Updated last year
- The Web Scraping Club Free Repository ☆141 Updated 2 weeks ago
- ☆23 Updated 5 months ago
- Minimal set of tools to conduct stealthy scraping. ☆156 Updated 2 years ago
- Patching CDP (Chrome DevTools Protocol) leaks on OS level. Easy to use with Playwright, Selenium, and other web automation tools. ☆116 Updated 8 months ago
- The Architecture of a Web Crawler: Building a Google-Inspired Distributed Web Crawler ☆115 Updated 5 months ago
- Asynchronous alternative to the requests-ip-rotator library ☆40 Updated 4 months ago
- G2 Scraper helps you collect G2 product data, including names, product descriptions, reviews, ratings, comparisons, alternatives, and mor… ☆43 Updated 4 months ago
- Super Fast, Super Anti-Detect, and Super Intuitive Web Driver ☆65 Updated last month
- A drop-in replacement for playwright-python patched with rebrowser-patches. It allows passing modern automation detection tests. ☆65 Updated this week
- A python package for finding e-mails, checking deliverability and more. ☆64 Updated last year
- Nodejs web scraper. Contains a command line, docker container, terraform module and ansible roles for distributed cloud scraping. Support… ☆112 Updated 2 years ago
- A library to extract the main content from html. Developed for information on LLM and for feeding data into LangChain and LlamaIndex. ☆39 Updated 11 months ago
- Nodriver integration for Scrapy ☆16 Updated 4 months ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ☆430 Updated 2 years ago
- Free IP Proxy rotator library for python ☆236 Updated last month
- playwright stealth ☆675 Updated 9 months ago
- Zillow scraper made in Python ☆59 Updated 5 months ago
- A simple LinkedIn profile scraper implemented as a chrome extension ☆80 Updated last year
- TypeScript library for Google search scraping using http requests with proxy support, pagination, and regional customization. Built for w… ☆35 Updated 3 months ago
- Create on demand free HTTPS/SOCKS5 proxy servers using AWS Free Tier EC2 instances automatically with Terraform ☆92 Updated 2 years ago
- Cloud crawler functions for scrapeulous ☆45 Updated 4 years ago
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee… ☆94 Updated 2 years ago
- Intelligent browser header & fingerprint generator ☆538 Updated last month
- Python SEO keywords suggestion tool. Google Autocomplete, People Also Ask and Related Searches. ☆120 Updated 2 years ago
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction. ☆66 Updated 4 years ago
- ☆131 Updated last year