caltechlibrary / documentarist
Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents
☆12 Updated 3 years ago
Alternatives and similar repositories for documentarist
Users who are interested in documentarist are comparing it to the repositories listed below
- Transcribes and summarizes speech or audio ☆37 Updated 4 years ago
- Post-processing OCR errors with seq2seq models ☆28 Updated 5 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated 7 months ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- Creating a simple recommendation system on the basis of similarity ☆11 Updated 7 years ago
- ☆12 Updated last year
- Finds linguistic patterns effortlessly ☆39 Updated 2 years ago
- Segmenting a given document using recursive xy-cut algorithm. ☆12 Updated 7 years ago
- Reproducing "Writing with Transformer" demo, using aitextgen/FastAPI in backend, Quill/React in frontend ☆27 Updated 4 years ago
- DFKI Layout Detection for OCR-D ☆47 Updated 6 months ago
- Experiments with Hugging Face 🔬 🤗 ☆44 Updated last year
- Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset. ☆27 Updated 4 years ago
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc. ☆18 Updated 2 years ago
- 🚀 GUI for training spaCy models ☆55 Updated 4 years ago
- Apply different text recognition services to images of handwritten documents. ☆187 Updated 2 years ago
- Deep learning-based reverse image search using the Annoy library ☆15 Updated 6 years ago
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 Updated 3 years ago
- Identification of crop diseases and pests from images using a deep learning framework. ☆28 Updated 7 years ago
- A machine learning tool for creating training datasets quickly and easily using a smart Chrome extension ☆14 Updated 2 years ago
- A Python package to get useful information from documents using the TopicRank algorithm. ☆16 Updated 2 years ago
- ☆20 Updated 4 years ago
- A Recommendation Engine API that can be used to recommend movies, music, games, manga, anime, comics, tv shows and books. Deployed using … ☆16 Updated 2 years ago
- Using Conditional Random Fields for segmenting Latin words written in scriptio continua ☆10 Updated 7 years ago
- Batch processing using joblib including tqdm progress bars ☆20 Updated 3 years ago
- Utilities for working with videos ☆13 Updated 4 months ago
- Next generation OCR engine based on LSTMs. ☆52 Updated 7 years ago
- Python wrapper for xpdf ☆19 Updated 6 years ago
- Text classification automl ☆21 Updated 4 years ago
- Deploy DL/ML inference pipelines with minimal extra code. ☆102 Updated last year
- DL models that take a document image file as input, locate the position of paragraphs, lines, images, etc. with their labels and confiden… ☆26 Updated 4 years ago