getomni-ai / benchmark
OCR Benchmark
☆609, updated 3 months ago
Alternatives and similar repositories for benchmark
Users that are interested in benchmark are comparing it to the libraries listed below
- Structured information extraction from documents (☆318, updated last year)
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 (☆352, updated 7 months ago)
- RAG evaluation without the need for "golden answers" (☆333, updated last month)
- Lightweight Nearest Neighbors with Flexible Backends (☆330, updated 3 weeks ago)
- Simple package to extract text with coordinates from programmatic PDFs (☆230, updated last week)
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. (☆840, updated 11 months ago)
- See Through Your Models (☆400, updated 6 months ago)
- Deep Research for your internal data (☆353, updated 7 months ago)
- open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig… (☆933, updated 11 months ago)
- (☆178, updated last month)
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… (☆1,412, updated 8 months ago)
- TWIX is an open-source data extraction tool that reconstructs structured data from documents at scale, accurately and at low cost, by inf… (☆210, updated last month)
- A flexible, adaptive classification system for dynamic text classification (☆522, updated 3 months ago)
- A hub for various industry-specific schemas to be used with VLMs. (☆539, updated last month)
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. (☆539, updated 2 months ago)
- Automatically annotate papers using LLMs (☆398, updated last month)
- Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline (☆811, updated last week)
- Fully neural approach for text chunking (☆406, updated 2 months ago)
- OpenAI's Structured Outputs with Logprobs (☆200, updated 7 months ago)
- Sycamore is an LLM-powered search and analytics platform for unstructured data. (☆585, updated this week)
- Docling core data types and transformations (☆221, updated this week)
- Extract structured text from pdfs quickly (☆651, updated 7 months ago)
- Pixelagent: Multimodal stateful agents (☆224, updated 7 months ago)
- Parse PDFs into markdown using Vision LLMs (☆455, updated 3 months ago)
- This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats. (☆1,276, updated 9 months ago)
- Fast Semantic Text Deduplication & Filtering (☆866, updated this week)
- Things you can do with the token embeddings of an LLM (☆1,450, updated last month)
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… (☆227, updated last year)
- Generic rag framework to apply the power of LLMs on any given dataset (☆660, updated last month)
- An example of multi-agent orchestration with llama-index (☆445, updated 11 months ago)