caltechlibrary / documentarist
Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents
☆12 Updated 2 years ago
Alternatives and similar repositories for documentarist:
Users who are interested in documentarist are comparing it to the libraries listed below.
- Segmenting a given document using recursive xy-cut algorithm. ☆12 Updated 6 years ago
- Tool for sentiment analysis annotation ☆12 Updated 4 months ago
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models ☆15 Updated this week
- OCR-D post-correction module based on weighted finite-state transducers ☆11 Updated last year
- Example of building a working Spanish-to-English translation model with Marian NMT ☆20 Updated 4 years ago
- Uses Beautiful Soup to read Wiki pages, Gensim to summarize, NLTK to process, and extracts keywords based on entropy: everything in one b… ☆9 Updated 4 years ago
- Full-featured Algorithmic Intelligence Music Augmentator (AIMA) with full multi-instrument MIDI output and Karaoke support. ☆9 Updated 4 years ago
- The project lets you automatically extract glossary words and their definitions from a given piece of text using NLP techniques ☆29 Updated 4 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated 3 weeks ago
- Tools for evaluating OCR performance relative to ground truth. ☆10 Updated last year
- ☆10 Updated 5 years ago
- Visualize large text collections with WebGL ☆25 Updated 5 months ago
- Finds linguistic patterns effortlessly ☆35 Updated last year
- ☆16 Updated 8 months ago
- DFKI Layout Detection for OCR-D ☆47 Updated 3 months ago
- ☆13 Updated last year
- Python script that scrapes the MIT OpenCourseWare website to download course materials. ☆11 Updated 7 years ago
- ☆15 Updated 3 years ago
- Wrapper around pixel classifier ☆9 Updated 2 years ago
- A system for reading scanned documents and grouping them into high level topics ☆16 Updated 4 years ago
- 'ocr-evaluation-tools' from http://ancientgreekocr.org/. Tools to test OCR accuracy. ☆22 Updated 6 years ago
- Fast and accurate natural language detection. Detector written in Python. Nito-ELD, ELD. ☆15 Updated last year
- ☆12 Updated 6 months ago
- A Python package to get useful information from documents using TopicRank Algorithm. ☆16 Updated last year
- Embedding Visualizer (EmbedViz) data app made with Streamlit library ☆22 Updated 4 years ago
- Python tools for Tesseract OCR training ☆25 Updated 2 years ago
- AI Starter Kit for Synthetic Voice and Audio Generation using Intel® Extension for Pytorch ☆2 Updated last year
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 Updated 3 years ago
- Tools for using OpenAI Codex to do various useful things ☆48 Updated 3 years ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆54 Updated last year