caltechlibrary / documentarist
Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents
☆12 · Updated 2 years ago
Alternatives and similar repositories for documentarist:
Users who are interested in documentarist are comparing it to the repositories listed below:
- A Python package to get useful information from documents using TopicRank Algorithm. ☆16 · Updated last year
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models ☆16 · Updated 2 weeks ago
- ☆16 · Updated 10 months ago
- Post-processing OCR errors with seq2seq models ☆28 · Updated 4 years ago
- Segmenting a given document using recursive xy-cut algorithm. ☆12 · Updated 6 years ago
- Generate variations of text through synonym matching ☆12 · Updated 7 years ago
- Finds linguistic patterns effortlessly ☆36 · Updated last year
- OCR-D post-correction module based on weighted finite-state transducers ☆11 · Updated last year
- Uses Beautiful Soup to read Wiki pages, Gensim to summarize, NLTK to process, and extracts keywords based on entropy: everything in one b… ☆9 · Updated 4 years ago
- An ongoing series of notebooks aimed at helping fellow NLP enthusiasts think about applying new tools and techniques to practical tasks. ☆18 · Updated 4 years ago
- Wrapper around pixel classifier ☆9 · Updated 3 years ago
- Poetic processing, for Python. ☆40 · Updated 11 months ago
- A selection of test lines of several early printed books as well as the corresponding individual OCRopus models and mixed models. ☆10 · Updated 7 years ago
- Tools for using OpenAI Codex to do various useful things ☆48 · Updated 3 years ago
- Generic Environment for Context-Aware Correction of Orthography ☆22 · Updated 2 years ago
- Python wrapper for xpdf ☆19 · Updated 5 years ago
- Example of building a working Spanish-to-English translation model with Marian NMT ☆22 · Updated 4 years ago
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values … ☆15 · Updated 6 years ago
- Using Conditional Random Fields for segmenting Latin words written in scriptio continua ☆10 · Updated 6 years ago
- Telegram-bot for NLP/RL courses ☆11 · Updated 2 years ago
- Tool for the Automatic Assessment of Lexical Diversity ☆11 · Updated 4 years ago
- Creating a simple recommendation system on the Basis of similarity ☆10 · Updated 6 years ago
- Text classification automl ☆21 · Updated 3 years ago
- GreenLIT: Using GPT-J with Multi-Task Learning to Create New Screenplays ☆17 · Updated 2 years ago
- Generative, iterative and algorithmic music creation ☆8 · Updated 3 years ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆53 · Updated last year
- Given a text, wrap it into phrases and send them to Yandex's search engine. If it yields a "did you mean:", substitute the original phras… ☆11 · Updated 6 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 · Updated 2 years ago
- ☆12 · Updated 8 months ago
- Extract knowledge from raw text ☆13 · Updated 3 years ago