N0taN3rd / node-warc
Parse And Create Web ARChive (WARC) files with node.js
☆103 · Updated last year
Alternatives and similar repositories for node-warc
Users who are interested in node-warc are comparing it to the libraries listed below.
- JS Streaming WARC IO optimized for Browser and Node ☆53 · Updated last week
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System ☆87 · Updated 4 years ago
- wabac.js - Web Archive Browsing Augmentation Client ☆122 · Updated last week
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head ☆173 · Updated 5 years ago
- Chrome extension to "Create WARC files from any webpage" ☆228 · Updated 2 months ago
- Specifications developed and maintained by the Webrecorder community. ☆140 · Updated 3 months ago
- Quickly estimate the similarity between many sets ☆53 · Updated 3 years ago
- Automatically extracts structured information from webpages ☆112 · Updated 3 years ago
- React components to render differences between captures at the Wayback Machine ☆37 · Updated 2 weeks ago
- Webrecorder Automated In-Page Behavior Framework ☆13 · Updated 4 years ago
- 🤬 Map of profane words to a rating of sureness ☆265 · Updated 2 years ago
- Get n-grams from text ☆84 · Updated 3 years ago
- visualise readability ☆214 · Updated last year
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆58 · Updated 5 months ago
- The OpenWayback Development ☆510 · Updated 2 years ago
- Fast Double Metaphone algorithm ☆97 · Updated 3 years ago
- Extended Date Time Format (ISO 8601-2 / EDTF) Parser for JavaScript ☆76 · Updated last month
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆39 · Updated 2 months ago
- Converts WARC files to static HTML ☆51 · Updated 4 months ago
- Compress json-data based on its json-schema while still having valid json ☆98 · Updated 3 weeks ago
- 📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity ☆98 · Updated 7 years ago
- Apache Annotator provides annotation enabling code for browsers, servers, and humans. ☆241 · Updated last year
- Convert between DOM Range instances and text quotes. ☆35 · Updated 2 years ago
- an image annotation and publication tool ☆27 · Updated 5 years ago
- Natural Language Concrete Syntax Tree format ☆229 · Updated last year
- Snapshots a web page to get it as a static, self-contained HTML document. ☆301 · Updated 3 years ago
- Tunable full text search engine in JavaScript that: (1) works natively on web apps like Express.js; (2) easy to customize (via BM25) to s… ☆35 · Updated 7 years ago
- 🔖 Node.js module for manipulating extended attributes ☆68 · Updated 4 years ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ☆169 · Updated 5 months ago
- A list of tools related to W(eb)ARC(hive) ☆67 · Updated 11 years ago