Nutlope / llama-ocr
Document to Markdown OCR library with Llama 3.2 vision
☆2,224 · Updated 2 months ago
Alternatives and similar repositories for llama-ocr:
Users who are interested in llama-ocr are comparing it to the libraries listed below:
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents… ☆2,494 · Updated last month
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ☆2,604 · Updated this week
- ☆1,333 · Updated this week
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,633 · Updated 3 weeks ago
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆859 · Updated last month
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ☆1,995 · Updated this week
- Local realtime voice AI ☆2,256 · Updated 2 weeks ago
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web wi… ☆4,003 · Updated last week
- 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library ☆2,818 · Updated this week
- Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc) int… ☆462 · Updated 2 weeks ago
- napkins.dev – from screenshot to app ☆1,075 · Updated 2 months ago
- Company Researcher tool helps you instantly understand any company inside out. ☆1,127 · Updated last month
- An Open Source implementation of Notebook LM with more flexibility and features ☆1,173 · Updated last week
- 📲 An agent for sourcing, curating, and scheduling social media posts with human-in-the-loop. ☆1,092 · Updated 2 weeks ago
- Transform PDFs into AI podcasts for engaging on-the-go audio content. ☆593 · Updated 3 weeks ago
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆1,027 · Updated this week
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… ☆837 · Updated 5 months ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,142 · Updated this week
- Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,513 · Updated this week
- Implementing OCR with a local visual model run by ollama. ☆258 · Updated 3 months ago
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,563 · Updated 3 weeks ago
- An AI personal tutor built with Llama 3.1 ☆1,814 · Updated 2 months ago
- 📃 A better UX for chat, writing content, and coding with LLMs. ☆4,132 · Updated last week
- Create apps with Gemini ☆754 · Updated 2 months ago
- Sample apps to help developers get started with Structured Outputs ☆620 · Updated 2 months ago
- Everything about the SmolLM2 and SmolVLM family of models ☆2,035 · Updated last week