caltechlibrary / documentarist
Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents
☆12 Updated 3 years ago
Alternatives and similar repositories for documentarist
Users who are interested in documentarist are comparing it to the repositories listed below
- ☆12 Updated last year
- Visualize large text collections with WebGL ☆27 Updated last year
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆80 Updated last week
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models ☆17 Updated last month
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆34 Updated 2 years ago
- 🦁 Nala is an agile open-source voice assistant framework (20+ actions). ☆36 Updated 2 years ago
- Post-processing OCR errors with seq2seq models ☆28 Updated 5 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated 9 months ago
- Transcribes and summarizes speech or audio ☆36 Updated 4 years ago
- OCR-D post-correction module based on weighted finite-state transducers ☆11 Updated 2 years ago
- Apply different text recognition services to images of handwritten documents. ☆188 Updated 3 years ago
- Instagram-like filters with deep learning ☆57 Updated last year
- ☆15 Updated last year
- Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset. ☆27 Updated 4 years ago
- Deploy DL/ML inference pipelines with minimal extra code. ☆102 Updated last year
- Segmenting a given document using the recursive XY-cut algorithm. ☆12 Updated 7 years ago
- Experiments with Hugging Face 🔬 🤗 ☆45 Updated last year
- DFKI Layout Detection for OCR-D ☆47 Updated 8 months ago
- A tidy and complete archive of metadata for papers on arxiv.org, 1993-2019 ☆28 Updated 6 years ago
- Python tools for Tesseract OCR training ☆26 Updated 3 years ago
- Transformer-based approaches for efficient docstring generation from Python code. ☆17 Updated this week
- Matplotlib Image labeller for classifying images ☆11 Updated 3 weeks ago
- Document Search Engine Tool ☆76 Updated 3 years ago
- Finds linguistic patterns effortlessly ☆39 Updated 2 years ago
- Visual similarity search engine demo using PyTorch Metric Learning and Qdrant ☆12 Updated 3 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 Updated 10 months ago
- A machine learning tool for creating training datasets quickly and easily with a smart Chrome extension ☆14 Updated 2 years ago
- Another implementation of the paper "Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs" in… ☆13 Updated 4 years ago
- Deep learning-based reverse image search using the Annoy library ☆15 Updated 6 years ago
- HyperTag - Intuitive Knowledge Management WebApp & CLI for Humans using Deep Learning & Tags ☆196 Updated 10 months ago