N0taN3rd / node-warc
Parse And Create Web ARChive (WARC) files with node.js
☆103 Updated last year
Alternatives and similar repositories for node-warc
Users that are interested in node-warc are comparing it to the libraries listed below
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head ☆172 Updated 5 years ago
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System ☆87 Updated 4 years ago
- wabac.js - Web Archive Browsing Augmentation Client ☆120 Updated this week
- JS Streaming WARC IO optimized for Browser and Node ☆53 Updated this week
- Automatically extracts structured information from webpages ☆112 Updated 3 years ago
- Chrome extension to "Create WARC files from any webpage" ☆227 Updated last month
- Extended Date Time Format (ISO 8601-2 / EDTF) Parser for JavaScript ☆76 Updated 3 weeks ago
- ⚙️ [Processor] A better English POS tagger written in JavaScript ☆56 Updated 8 years ago
- React components to render differences between captures at the Wayback Machine ☆37 Updated last week
- generate rules from lists of words ☆16 Updated 4 years ago
- Webrecorder Automated In-Page Behavior Framework ☆13 Updated 4 years ago
- Fast Metaphone implementation ☆53 Updated 3 years ago
- 🤬 Map of profane words to a rating of sureness ☆265 Updated 2 years ago
- The OpenWayback Development ☆507 Updated 2 years ago
- An implementation of LevelDOWN that uses Amazon S3. Turn your S3 bucket into a DB ☆62 Updated last week
- Formula to detect ease of reading according to the Automated Readability Index (1967) ☆52 Updated 3 years ago
- English Part-of-speech (POS) tagger ☆70 Updated 2 years ago
- Compress json-data based on its json-schema while still having valid json ☆98 Updated 2 weeks ago
- Accurate and fast sentiment scoring of phrases with #hashtags, emoticons :) & emojis 🎉 ☆62 Updated 2 years ago
- neato compression for key-value data ☆110 Updated last year
- Natural Language Concrete Syntax Tree format ☆229 Updated last year
- Specifications developed and maintained by the Webrecorder community. ☆140 Updated 3 months ago
- 🔖 Node.js module for manipulating extended attributes ☆68 Updated 4 years ago
- Library to load JSON-LD from stdin, URLs, or files. ☆42 Updated 2 years ago
- English NLP for Node.js and the browser. ☆87 Updated 2 years ago
- Quickly estimate the similarity between many sets ☆53 Updated 3 years ago
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆39 Updated 2 months ago
- Converts WARC files to static HTML ☆49 Updated 4 months ago
- Fast Porter stemmer implementation ☆135 Updated 3 years ago
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆57 Updated 5 months ago