html-extract / hext
Domain-specific language for extracting structured data from HTML documents
☆52 · Updated last week
Related projects
Alternatives and complementary repositories for hext
- experiments in sorting ☆25 · Updated last year
- My personally curated list of bash/command-line commands and snippets that are very useful yet I keep on forgetting ☆18 · Updated 2 years ago
- A collection of visualization projects built on Wikipedia data. ☆40 · Updated last year
- Generating text completions based on the Mueller report ☆28 · Updated 5 years ago
- Browser version of Hyphe (WIP) ☆29 · Updated 3 weeks ago
- API endpoint and UI for blockbuilder search page ☆20 · Updated last year
- A network clustering library for javascript ☆34 · Updated last year
- Trough: Big data, small databases. ☆38 · Updated 3 months ago
- a work-in-progress guide to web scraping as an artistic and critical practice ☆78 · Updated last year
- Visualize the evolution of a file tracked by git ☆24 · Updated 6 years ago
- Extract networks of entities from journalistic reporting ☆47 · Updated last year
- Machine learning model to recommend related content ☆19 · Updated last year
- basically all words, in a compressed form ☆16 · Updated last year
- ☆27 · Updated 7 years ago
- A lightweight JavaScript client library for the Wikimedia Pageviews API for Wikipedia and various of its sister projects for Node.js and … ☆26 · Updated 3 years ago
- Add website scraping abilities to Datasette ☆61 · Updated last year
- etl pipeline, graphical explorer and general toolbox for investigations with follow the money data ☆14 · Updated 10 months ago
- A fast, simple, memory-efficient graph layout algorithm for visualizing networks in D3 ☆50 · Updated 2 years ago
- ☆84 · Updated 2 years ago
- MediaScape project researching the utility of Generous Interfaces for audiovisual archives ☆10 · Updated last year
- Datawrapper API v3 (in Node) ☆13 · Updated 3 years ago
- Converts svg to gcode for pen plotters ☆54 · Updated 2 years ago
- Because what if you could just... write graphics sketches? On the web? Like, directly? ☆17 · Updated 3 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 · Updated 5 years ago
- A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state. ☆24 · Updated last year
- ⚙️ [Processor] A better English POS tagger written in JavaScript ☆53 · Updated 7 years ago
- makes supercuts from youtube searches (alpha) ☆12 · Updated 6 years ago
- Add editing UI and other power-user features to Datasette. ☆12 · Updated last year
- Neo4j powered web application for multimedia collections: bring graph-based exploration and crowd-based indexation. ☆23 · Updated 4 years ago