archivesunleashed / twut
An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.
☆9 · Updated 4 months ago
Alternatives and similar repositories for twut:
Users who are interested in twut are comparing it to the libraries listed below.
- A gathering of digital methods recipes for research, teaching and collaborations from across the Public Data Lab. ☆11 · Updated last year
- Service for creating Twitter datasets for research and archiving. ☆26 · Updated 2 years ago
- A LevelDB backed URL unshortening microservice written in JavaScript ☆31 · Updated 2 years ago
- A collection of ipython/jupyter notebooks ☆16 · Updated 6 years ago
- Various examples of notebooks for working with web archives with the Archives Unleashed Toolkit, and derivatives generated by the Archive… ☆26 · Updated 2 years ago
- A simple catalog of Twitter ID Datasets ☆28 · Updated 4 months ago
- Web application for distributed compute analysis of Archive-It web archive collections. ☆17 · Updated last month
- A Twitter data collection and appraisal application. ☆51 · Updated 2 years ago
- Ask questions about government data. ☆37 · Updated 6 years ago
- A PDF classifier ensemble with REST API service ☆23 · Updated 4 years ago
- A digital humanities operating system that runs on a USB disk. ☆31 · Updated 7 years ago
- Named-Entity Recognition extension for OpenRefine ☆27 · Updated 2 years ago
- command line resource for working with digital primary sources ☆27 · Updated 6 years ago
- A tool for working with tweet archives. ☆15 · Updated 2 years ago
- Extract case law citations with Node ☆57 · Updated 10 years ago
- WASAPI data transfer APIs ☆44 · Updated 2 years ago
- Specification for authentication and creating signed WACZ Files ☆10 · Updated 3 years ago
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX) ☆24 · Updated 7 years ago
- OpenRefine reconciler for Research Organization Registry ☆13 · Updated 2 weeks ago
- A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (me… ☆14 · Updated 3 years ago
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆13 · Updated last month
- Webrecorder Automated In-Page Behavior Framework ☆13 · Updated 3 years ago
- 🔎 Finds fuzzy matches between datasets ☆12 · Updated 2 months ago
- Open Access PDF harvester ☆39 · Updated 11 months ago
- H2O is a web app for creating and reading open educational resources, primarily in the legal field ☆38 · Updated 2 months ago
- Humanities Data Curation Record ☆11 · Updated 7 years ago
- Web Archives for Historical Research ☆13 · Updated 7 years ago