internetarchive / pdf_trio
A PDF classifier ensemble with REST API service
☆23 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for pdf_trio
- Open Access PDF harvester ☆35 · Updated 6 months ago
- Trough: Big data, small databases. ☆40 · Updated 3 months ago
- A browser extension providing Open Access bibliographical services ☆14 · Updated last year
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki ☆25 · Updated 3 months ago
- WASAPI data transfer APIs ☆42 · Updated 2 years ago
- Specification for authentication and creating signed WACZ Files ☆9 · Updated 2 years ago
- Adding links to full text in Wikipedia references ☆37 · Updated 10 months ago
- Processing OpenCitations Data ☆17 · Updated 7 years ago
- wrapper for the crossref events api ☆17 · Updated last year
- MOVED to https://gitlab.com/crossref/reference_matching_evaluation_framework ☆17 · Updated 5 years ago
- Perpetual Access To The Scholarly Record ☆115 · Updated 3 months ago
- Citation Classification using hybrid neural network model for Wikipedia References ☆28 · Updated last year
- Sort-friendly URI Reordering Transform (SURT) python module ☆40 · Updated 3 months ago
- Process, enhance and evaluate multiple OCR output. ☆21 · Updated 3 weeks ago
- Make MP3 albums out of Academic PDFs. Works by gluing together Grobid and TTS offerings. ☆12 · Updated 11 months ago
- curation workflow automation and coordination ☆41 · Updated 2 months ago
- Google Refine extension for adding columns (extending data) from DBpedia ☆39 · Updated 11 years ago
- Web application for distributed compute analysis of Archive-It web archive collections. ☆15 · Updated 2 months ago
- Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery ☆53 · Updated 4 months ago
- OpenRefine reconciler for Research Organization Registry ☆13 · Updated last year
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆46 · Updated 2 years ago
- Service for creating Twitter datasets for research and archiving. ☆26 · Updated last year
- OpenCitations provides in RDF accurate citation information harvested from the scholarly literature. ☆64 · Updated 6 years ago
- Impactstory: the next generation ☆77 · Updated 3 years ago
- A Rails engine supporting the discovery of web archives. ☆49 · Updated last year
- A Memento Aggregator CLI and Server in Go ☆57 · Updated 6 months ago
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a… ☆95 · Updated 2 years ago
- Softcite software mention recognizer, finding mentions and citations to software from within the academic literature ☆68 · Updated 3 weeks ago
- Digital Preservation of HTTP in documentary heritage. ☆22 · Updated last year
- 💠 + 📚 OpenRefine on Binder! ☆40 · Updated 5 months ago