NikolaiT / struktur
A module that extracts structured information from a rendered HTML site and outputs JSON. HTML to JSON.
★69 Updated 3 years ago
Alternatives and similar repositories for struktur:
Users who are interested in struktur are comparing it to the libraries listed below.
- A uniform template to use as a foundation for Puppeteer bot construction. ★65 Updated 3 years ago
- Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSO… ★150 Updated last year
- A test suite of common scraper detection techniques. See how detectable your scraper stack is. ★136 Updated 2 years ago
- A conceptual patch which modifies some vanilla puppeteer files to decrease detection rates. ★50 Updated 3 years ago
- Minimal set of tools to conduct stealthy scraping. ★153 Updated last year
- NodeJs package for generating browser-like headers. ★65 Updated 2 years ago
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee… ★92 Updated 2 years ago
- DFPM is a browser extension for detecting browser fingerprinting. ★114 Updated 2 years ago
- Automatically extracts structured information from webpages ★107 Updated 2 years ago
- Cloud crawler functions for scrapeulous ★44 Updated 3 years ago
- Tooling to access Puppeteer's internal Isolated World. ★21 Updated 3 years ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ★422 Updated 2 years ago
- ★109 Updated 10 months ago
- Add-ons for Playwright: adblocker, stealth mode ★46 Updated 3 years ago
- Bot detection tests for Puppeteer. Hide and seek! ★85 Updated last year
- Bypassing bot detection checks with Puppeteer. ★94 Updated 4 years ago
- Proxies Puppeteer Page requests. ★204 Updated 5 months ago
- A complimentary proxy to help to use SPM with headless browsers ★109 Updated last year
- Advanced Node proxy checker (node proxy verifier, node proxy tester) with socks and https support ★109 Updated 2 years ago
- A simple puppeteer wrapper to enable useful plugins with ease ★55 Updated this week
- How to detect puppeteer with 100% accuracy ★107 Updated 3 years ago
- Fingerprinting script of Fingerprint-Scanner ★241 Updated 10 months ago
- Scrapy rotation proxy package with advanced functions ★94 Updated 2 years ago
- A web page that compiles methods used by Akamai, Datadome, and other bot detection solutions and WAF (Web Application Firewall) to identi… ★43 Updated 3 years ago
- ★25 Updated 3 years ago
- ★13 Updated last year
- Dead simple cron service for making HTTP calls on a regular schedule. ★14 Updated 4 years ago
- Flatten, format, and export any JSON-like data to CSV (or any other string output). ★17 Updated 3 years ago
- JavaScript code of many commercial bot detectors/fingerprinting services and string deobfuscators for them if applicable. ★121 Updated 3 years ago
- Chromium / Puppeteer site crawler ★48 Updated 4 years ago