wjbmattingly / dots.ocr

Multilingual Document Layout Parsing in a Single Vision-Language Model

☆56

Alternatives and similar repositories for dots.ocr

Users who are interested in dots.ocr are comparing it to the repositories listed below.

s-emanuilov / litepali
LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment.
☆122 Updated last year
kturung / colpali-llama-vision-rag
☆114 Updated last year
tonywu71 / colpali-cookbooks
Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳
☆352 Updated 8 months ago
docling-project / docling-ibm-models
☆185 Updated 2 weeks ago
huggingface / huggingface-gemma-recipes
Inference, fine-tuning, and many more recipes with the Gemma family of models
☆279 Updated 6 months ago
alvarobartt / hf-mem
A CLI to estimate inference memory requirements for Hugging Face models, written in Python.
☆683 Updated last week
felixdittrich92 / OnnxTR
OnnxTR: a docTR (Document Text Recognition) library ONNX pipeline wrapper for seamless, high-performing & accessible OCR
☆171 Updated last week
qubvel / transformers-notebooks
Inference and fine-tuning examples for vision models from 🤗 Transformers
☆165 Updated 6 months ago
roboflow / model-leaderboard
Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard.
☆99 Updated this week
jina-ai / correlations
Simple UI for debugging correlations of text embeddings
☆305 Updated 8 months ago
ariG23498 / gemma3-object-detection
Fine-tune Gemma 3 on an object detection task
☆97 Updated 6 months ago
docling-project / docling-core
Docling core data types and transformations
☆225 Updated this week
Vishnunkumar / craft_hw_ocr
Recognition of handwritten text using CRAFT text detection and TrOCR
☆26 Updated 3 years ago
abhishekkrthakur / aiaio
Lightweight, Python-based chat UI
☆342 Updated 2 months ago
numindai / NuMarkdown
☆198 Updated 6 months ago
patrickloeber / genai-tutorials
Code examples showing how to use Gemini, Gemma, Imagen, and more.
☆50 Updated 3 weeks ago
LynnHaDo / Checkbox-Detection
Checkbox Detection Model for Scanned Documents
☆91 Updated 11 months ago
docling-project / docling-parse
Simple package to extract text with coordinates from programmatic PDFs
☆238 Updated this week
Paulescu / plot-generator-agent
Join 15k builders to the Real-World ML Newsletter ⬇️⬇️⬇️
☆47 Updated last year
wjbmattingly / qwen2-vl-finetune-huggingface
This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.
☆77 Updated 6 months ago
datalab-to / pdftext
Extract structured text from PDFs quickly
☆661 Updated 8 months ago
ai8hyf / TF-ID
TF-ID: Table/Figure IDentifier for academic papers
☆245 Updated last year
huggingface / large-scale-image-deduplication
☆188 Updated 6 months ago
moured / YOLOv10-Document-Layout-Analysis
YOLOv10 trained on DocLayNet dataset.
☆80 Updated last year
kurakurai / Luth
Luth is a state-of-the-art series of fine-tuned LLMs for French
☆41 Updated 4 months ago
adithya-s-k / YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…
☆85 Updated last year
philschmid / gemini-2.5-ai-engineering-workshop
☆212 Updated 8 months ago
di37 / gemma3-270M-tinystories-pytorch
A complete PyTorch implementation of Google's Gemma3 270M language model, featuring sliding window attention, RoPE positional encoding, a…
☆44 Updated 5 months ago
AnswerDotAI / byaldi
Use late-interaction multi-modal models such as ColPali in just a few lines of code.
☆843 Updated last year
roboflow / vision-ai-checkup
Take your LLM to the optometrist.
☆46 Updated last week