caltechlibrary / dataset
dataset is a command-line tool, Go package, shared library, and Python package for working with JSON objects as collections
☆24 Updated last week
Alternatives and similar repositories for dataset:
Users who are interested in dataset are comparing it to the libraries listed below
- Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives ☆15 Updated 3 years ago
- A PDF classifier ensemble with REST API service ☆23 Updated 4 years ago
- Trough: Big data, small databases. ☆41 Updated 9 months ago
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki ☆26 Updated 8 months ago
- TimerMetrics captures timings and enables periodic metrics every n events ☆15 Updated 5 years ago
- Singularity Image Format (SIF) reference implementation. ☆19 Updated 2 weeks ago
- Wiki-style RecentChanges page for Notion.so databases. ☆12 Updated 3 years ago
- Golang WARC (Web ARChive) Library ☆30 Updated 5 years ago
- Tool for running transformations on columns in a SQLite database ☆31 Updated 3 years ago
- Occult is an open-source, distributed, cache-oriented, array processing architecture for scientific computing. ☆29 Updated 10 years ago
- Jupyter Kernel for Xonsh ☆22 Updated 8 years ago
- Makefile and templates to create specifications from Pandoc Markdown syntax ☆13 Updated 7 months ago
- A highly scalable collector for tricorder applications ☆10 Updated 7 years ago
- Span formats. ☆17 Updated last week
- Hash-based password manager ☆19 Updated 5 years ago
- Document Imaging Archive System. Home document imaging, with OCR. Scan documents (with SANE) or import ODF documents, assign tags. Use op… ☆24 Updated 9 years ago
- Site for publicly archiving content hashes (C port) ☆9 Updated 3 years ago
- Interactive search of non-indexed data ☆19 Updated 2 years ago
- ☆12 Updated 5 years ago
- ☸️ Hub for executable documents ☆32 Updated last week
- oldweb.today Remote/Containerized Browser System ☆10 Updated 6 years ago
- utilities for filesystem exploration and automated builds ☆21 Updated this week
- High performance multiplexed user fuse mounting ☆20 Updated 2 years ago
- ⨝ Mandatory Access Control for SQLite databases ☆14 Updated 8 years ago
- a web based tool to monitor how your website content is used in wikipedia ☆37 Updated 4 years ago
- CLI implementation of httpreserve that can test links and retrieve internet archive replacements ☆10 Updated 5 months ago
- Navigating around a grid of cells like XPath for spreadsheets; supports Python 3.5+ ☆48 Updated 2 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 9 years ago
- This package contains helpers to deal with physical variables and units. ☆12 Updated 2 years ago
- Security research organization dedicated to finding low hanging, critical, vulnerabilities. ☆14 Updated 2 years ago