N0taN3rd / node-warc
Parse And Create Web ARChive (WARC) files with node.js
☆103 Updated 11 months ago
Alternatives and similar repositories for node-warc
Users that are interested in node-warc are comparing it to the libraries listed below
- wabac.js - Web Archive Browsing Augmentation Client ☆117 Updated last month
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head ☆172 Updated 5 years ago
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System ☆87 Updated 4 years ago
- JS Streaming WARC IO optimized for Browser and Node ☆52 Updated 3 months ago
- Chrome extension to "Create WARC files from any webpage" ☆227 Updated last month
- Convert between DOM Range instances and text quotes. ☆35 Updated 2 years ago
- Quickly estimate the similarity between many sets ☆53 Updated 3 years ago
- Automatically extracts structured information from webpages ☆110 Updated 3 years ago
- generate rules from lists of words ☆16 Updated 4 years ago
- Wombat.js client-side rewriting library ☆110 Updated last month
- Specifications developed and maintained by the Webrecorder community. ☆138 Updated 2 months ago
- Local reverse geocoder for Node.js based on GeoNames data ☆208 Updated last year
- Extended Date Time Format (ISO 8601-2 / EDTF) Parser for JavaScript ☆76 Updated this week
- A `htmlparser2` handler for parsing rich metadata from HTML. Includes HTML metadata, JSON-LD, RDFa, microdata, OEmbed, Twitter cards and … ☆56 Updated 2 years ago
- A list of tools related to W(eb)ARC(hive) ☆67 Updated 11 years ago
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆39 Updated last month
- ⚙️ [Processor] A better English POS tagger written in JavaScript ☆56 Updated 8 years ago
- Storex Core - A modular and portable database abstraction ecosystem for JavaScript ☆155 Updated last month
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆57 Updated 4 months ago
- Convert between DOM Range instances and text positions. ☆26 Updated 5 years ago
- English NLP for Node.js and the browser. ☆87 Updated 2 years ago
- Node bindings for Annoy, an efficient Approximate Nearest Neighbors implementation written in C++. ☆82 Updated 2 years ago
- 🔖 Node.js module for manipulating extended attributes ☆68 Updated 4 years ago
- atjson is a living content format for annotating content ☆226 Updated last week
- React components to render differences between captures at the Wayback Machine ☆37 Updated last week
- Library to load JSON-LD from stdin, URLs, or files. ☆42 Updated 2 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆54 Updated last month
- mbox file parser for Node.js ☆72 Updated 5 years ago
- Cypher graph database for Javascript ☆66 Updated 7 months ago
- Apache Annotator provides annotation enabling code for browsers, servers, and humans. ☆240 Updated last year