getomni-ai / benchmark
OCR Benchmark
☆511 Updated 3 weeks ago
Alternatives and similar repositories for benchmark
Users who are interested in benchmark are comparing it to the libraries listed below.
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,138 Updated last month
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆797 Updated 4 months ago
- Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig… ☆920 Updated 4 months ago
- Structured information extraction from documents ☆315 Updated 8 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆313 Updated 3 weeks ago
- Make any LLM think like OpenAI o1 and DeepSeek R1 ☆490 Updated 4 months ago
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆221 Updated 6 months ago
- Extract structured text from PDFs quickly ☆497 Updated 2 weeks ago
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL ☆1,015 Updated last week
- Detect and extract tables to Markdown and CSV ☆749 Updated 5 months ago
- See Through Your Models ☆394 Updated 3 months ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,284 Updated 2 weeks ago
- Deep Research for your internal data ☆327 Updated 2 weeks ago
- A hub for various industry-specific schemas to be used with VLMs. ☆518 Updated 3 weeks ago
- 📄 🧠 PageIndex: Document Index System for Reasoning-based RAG ☆1,066 Updated last week
- OpenAI's Structured Outputs with Logprobs ☆172 Updated 3 weeks ago
- Generic RAG framework to apply the power of LLMs to any given dataset ☆628 Updated last week
- Dabbling with ReAct chatbots ☆204 Updated 10 months ago
- Fully neural approach for text chunking ☆357 Updated last month
- Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc.) int… ☆622 Updated 3 months ago
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… ☆858 Updated 9 months ago
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) ☆653 Updated last month
- Lightweight Nearest Neighbors with Flexible Backends ☆290 Updated 3 weeks ago
- A flexible, adaptive classification system for dynamic text classification ☆240 Updated this week
- Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework ☆340 Updated 6 months ago
- TWIX is an open-source data extraction tool that reconstructs structured data from documents at scale, accurately and at low cost, by inf… ☆189 Updated 3 weeks ago
- Browser-LLM Auto-Scaling Technology ☆524 Updated this week
- A cache for AI agents to learn and replay complex behaviors. ☆670 Updated last week
- Things you can do with the token embeddings of an LLM ☆1,445 Updated 2 months ago
- Fast Semantic Text Deduplication & Filtering ☆738 Updated 3 weeks ago