scrapoxy / scraping-workshop
☆83Updated 4 months ago
Alternatives and similar repositories for scraping-workshop:
Users who are interested in scraping-workshop are comparing it to the libraries listed below.
- Undetected Web-Scraping & Seamless HTML Parsing in Python!☆239Updated 2 months ago
- Staff fetcher library for LinkedIn - obtain experiences, schools, skills & contact info☆137Updated 3 weeks ago
- Automated web scraping spider generation using Browser Use and LLMs. Streamline the creation of Playwright-based spiders with minimal man…☆55Updated this week
- The Web Scraping Club Free Repository☆139Updated 5 months ago
- A drop-in replacement for playwright-python patched with rebrowser-patches. It lets you pass modern automation detection tests.☆63Updated 4 months ago
- Free IP proxy rotator library for Python☆231Updated 3 weeks ago
- Unflare helps you to bypass Cloudflare protection☆86Updated last week
- Super Fast, Super Anti-Detect, and Super Intuitive Web Driver☆60Updated last week
- A low-code data extractor for websites with built-in proxy and parsing capabilities. Great for testing and debugging CSS selectors☆180Updated 7 months ago
- A blazing fast, async-first, undetectable webscraping/web automation framework based on ultrafunkamsterdam/nodriver. Now with Docker supp…☆370Updated this week
- ☆23Updated 4 months ago
- aiohttp-like interface to Chromium. Based on selenium_driverless to bypass Cloudflare☆53Updated 5 months ago
- Detects the presence of anti-bot and fingerprinting technologies on websites by analyzing requests, headers, cookies, and more. Built on …☆44Updated 6 months ago
- Patching CDP (Chrome DevTools Protocol) leaks at the OS level. Easy to use with Playwright, Selenium, and other web automation tools.☆112Updated 8 months ago
- Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints.☆149Updated last month
- Open Source LinkedIn Scraper☆87Updated 3 months ago
- Curated list of everything related to captchas, including libraries, solvers and scoring☆27Updated 9 months ago
- A fork of https://github.com/AtuboDad/playwright_stealth☆85Updated this week
- ☆29Updated 6 months ago
- Bypass Cloudflare's /h/b/jsd challenge using 100% Python☆188Updated this week
- A blazing-fast Python HTTP Client with TLS fingerprint☆299Updated last week
- 🎭 Intelligent browser header & fingerprint generator☆508Updated last month
- A collection of 10,000 Windows Chrome fingerprints. Usable with an easy-to-use API, available as a compressed (lzma) or full-si…☆220Updated 4 months ago
- The Python toolkit for converting Reddit threads into organized text data. Extract and process Reddit content with ease!☆95Updated 8 months ago
- 🚀 Web scraping for humans☆844Updated 4 months ago
- Undetected Python version of the Playwright testing and automation library.☆430Updated this week
- Spider ported to Python☆77Updated 2 months ago
- Turn webpages into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, image & webpage links extr…☆163Updated last week
- Camoufox Integration For Scrapy Updated 3 months ago
- 🦉Gracefully face reCAPTCHA challenge with ultralytics YOLOv8-seg, CLIPs VIT-B/16 and CLIP-Seg/RD64. Implemented in playwright or an easy…☆164Updated 3 months ago