html-extract / hext
Domain-specific language for extracting structured data from HTML documents
★54 · Updated last month
Alternatives and similar repositories for hext
Users who are interested in hext are comparing it to the libraries listed below.
- My personally curated list of bash/command-line commands and snippets that are very useful yet I keep on forgetting ★19 · Updated 3 years ago
- Read a Google Drive Doc and convert to JSON (via ArchieML) ★22 · Updated 7 years ago
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System ★87 · Updated 4 years ago
- A suite of focused and simple tools and activities for journalists, data journalism classrooms and community advocacy groups ★63 · Updated 2 months ago
- a simple graph shell to explore ideas ★117 · Updated 4 months ago
- Computer assisted video/audio transcription ★97 · Updated 5 years ago
- The Datasette macOS application ★132 · Updated last year
- Twitter, quick. Fetch and store tweets on short notice. ★79 · Updated 8 years ago
- experiments in sorting ★27 · Updated 2 years ago
- generate rules from lists of words ★16 · Updated 4 years ago
- Schemas to convert common fixed-width file formats into CSV using in2csv. ★125 · Updated 4 years ago
- Visualize the evolution of a file tracked by git ★27 · Updated 7 years ago
- Browser version of Hyphe (WIP) ★31 · Updated 6 months ago
- a work-in-progress guide to web scraping as an artistic and critical practice ★83 · Updated 2 years ago
- A simple utility for SQL-like joins with Json, GeoJson or dbf data in Node, the browser and on the command line. Also creates join report… ★52 · Updated 2 years ago
- Binary Python bindings for poppler utils for content extraction ★42 · Updated 4 years ago
- Add website scraping abilities to Datasette ★66 · Updated 2 years ago
- Pre-render Observable notebooks for automation ★61 · Updated 3 years ago
- Predicts likes, comment or total interactions of a facebook page post using machine learning ★10 · Updated 7 years ago
- Datasette plugin for visualizing data using Vega ★61 · Updated last month
- d3 plugin to create a temporal network visualization ★18 · Updated 2 years ago
- A tiny custom element for all your scrollytelling needs! ★27 · Updated 3 years ago
- This project represents the 300-dimensional word vectors from word2vec as JSON. ★129 · Updated 9 years ago
- Snowclone a Minute! You too can write an annoying twitter bot of your choosing. ★11 · Updated 8 years ago
- API endpoint and UI for blockbuilder search page ★20 · Updated 2 years ago
- An open-source archive that gathers, saves, shares and analyzes news homepages ★148 · Updated last month
- Uses Google Apps Scripts with Google Docs to provide a document tree in JSON exposed on a GET URL for integration into anything. ★28 · Updated 7 years ago
- A network clustering library for javascript ★35 · Updated 2 weeks ago
- A library for accessing a spreadsheet as a native Python object suitable for templating. ★226 · Updated 7 years ago
- API implementation, User Interface, and more modules of the IPTC EXTRA project ★13 · Updated 3 years ago