html-extract / hext
Domain-specific language for extracting structured data from HTML documents
☆54 · Updated 2 months ago
Alternatives and similar repositories for hext
Users who are interested in hext are comparing it to the libraries listed below.
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System ☆87 · Updated 4 years ago
- My personally curated list of bash/command-line commands and snippets that are very useful yet I keep on forgetting ☆19 · Updated 3 years ago
- experiments in sorting ☆27 · Updated 3 years ago
- a simple graph shell to explore ideas ☆117 · Updated 4 months ago
- Now included in rigour ☆152 · Updated last month
- Twitter, quick. Fetch and store tweets on short notice. ☆79 · Updated 9 years ago
- a work-in-progress guide to web scraping as an artistic and critical practice ☆84 · Updated 2 years ago
- 📑 Read a Google Drive Doc and convert to JSON (via ArchieML) ☆22 · Updated 7 years ago
- A suite of focused and simple tools and activities for journalists, data journalism classrooms and community advocacy groups ☆63 · Updated 3 weeks ago
- Dead simple cron service for making HTTP calls on a regular schedule. ☆14 · Updated 5 years ago
- Binary Python bindings for poppler utils for content extraction ☆42 · Updated 4 years ago
- Datasette plugin for visualizing data using Vega ☆61 · Updated last month
- A library for accessing a spreadsheet as a native Python object suitable for templating. ☆226 · Updated 7 years ago
- Extract networks of entities from journalistic reporting ☆49 · Updated 2 years ago
- An open-source archive that gathers, saves, shares and analyzes news homepages ☆151 · Updated last week
- Computer assisted video/audio transcription ☆97 · Updated 5 years ago
- Pull out versions of specific files from a gitscraping repo into individual files. ☆14 · Updated 4 years ago
- ☆86 · Updated 3 years ago
- Pre-render Observable notebooks for automation ☆62 · Updated 3 years ago
- Schemas to convert common fixed-width file formats into CSV using in2csv. ☆125 · Updated 4 years ago
- Monitor datasets, get alerts when something happens ☆210 · Updated 7 years ago
- framework to orchestrate the download and analysis of media ☆100 · Updated 2 years ago
- A Node.js wrapper around the DocumentCloud API. ☆12 · Updated 8 years ago
- video editing and compositing with python and melt ☆133 · Updated 2 years ago
- A data pipeline helper written in node to convert a folder of JS/ArchieML/JSON/YAML/CSV/TSV files into usable data. ☆47 · Updated 2 years ago
- A lightweight JavaScript client library for the Wikimedia Pageviews API for Wikipedia and various of its sister projects for Node.js and … ☆27 · Updated 4 years ago
- NW.js-based OS X desktop application that, given a video/audio file, returns a transcription using the IBM Watson Speech to Text API ☆41 · Updated 8 years ago
- Machine learning model to recommend related content ☆19 · Updated 2 years ago
- 📜 A tiny custom element for all your scrollytelling needs! ☆27 · Updated 3 years ago
- An alpha project combining beneficial ownership and contracting data ☆13 · Updated 4 years ago