nrv / pimmi
Python IMage MIning
☆14Updated 6 months ago
Alternatives and similar repositories for pimmi
Users that are interested in pimmi are comparing it to the libraries listed below
- ☆11Updated 2 months ago
- Adds a reconciliation API endpoint to Datasette, based on the Reconciliation Service API specification.☆24Updated last year
- Awesome AI in Libraries☆16Updated 2 years ago
- A command line utility for listing and searching snapshots in web archives☆16Updated last year
- Web Archiving Course☆23Updated last year
- A tool for collecting archival slivers of the web and web archives☆14Updated 7 months ago
- Command line tool for digging into WARC files☆46Updated last week
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages.☆36Updated 4 months ago
- Python package to reconcile DataFrames☆24Updated 2 years ago
- Convert ALTO XML to plain text + minimal metadata☆17Updated 11 months ago
- ☆14Updated 5 months ago
- Command line interface to Wikidata Query Service☆55Updated last year
- A client for the Archive-It And Webrecorder WASAPI Data Transfer API☆16Updated 5 years ago
- A Python library for defining rule-based overrides on messy data☆16Updated 2 weeks ago
- Heritage Connector: Transforming text into data to extract meaning and make connections☆24Updated 2 years ago
- Platform for journalists to search, analyse, categorise and share unstructured data☆55Updated this week
- Static Site Generator for Viewing Web Archives (in WACZ format)☆28Updated 2 years ago
- An experimental Python server for scholarly web annotations☆12Updated 4 years ago
- Browser-based app for segmenting & OCRing PDF pages based on whitespace rules. To assist researchers (especially in the humanities) with …☆12Updated last year
- Dockerized development environment for Omeka S☆10Updated last week
- Scripts to create git repositories for ALTO XML texts, like those from the British Library's scanned documents.☆31Updated 7 years ago
- Web application for transcribing OCR ground truth from Archive.org☆17Updated 7 years ago
- DuckDB Engine as Google Sheets Library☆19Updated 9 months ago
- OpenRefine reconciler for Research Organization Registry☆13Updated 5 months ago
- Self hosting code for Recogito-Studio☆19Updated 2 months ago
- Download and attach provenance to public datasets☆34Updated 5 months ago
- A social media open post web archiving tool☆27Updated last week
- Download GitHub repositories☆11Updated 4 months ago
- annotation storage backend☆10Updated 5 months ago
- Extract networks of entities from journalistic reporting☆48Updated 2 years ago