NVIDIA / nv-ingest
NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
☆2,733Updated this week
Alternatives and similar repositories for nv-ingest
Users who are interested in nv-ingest are comparing it to the libraries listed below.
- Vision infrastructure to turn complex documents into RAG/LLM-ready data☆2,796Updated last week
- A system for agentic LLM-powered data processing and ETL☆2,722Updated last week
- The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML.☆3,524Updated last week
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st…☆1,255Updated 4 months ago
- Knowledge Agents and Management in the Cloud☆4,121Updated this week
- Fast State-of-the-Art Static Embeddings☆1,807Updated 2 weeks ago
- A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data☆1,465Updated 3 months ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.☆1,388Updated this week
- RAG that intelligently adapts to your use case, data, and queries☆3,468Updated 2 months ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.☆2,164Updated last week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.☆7,123Updated 6 months ago
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry☆4,193Updated 6 months ago
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid…☆2,638Updated last month
- High-performance retrieval engine for unstructured data☆1,481Updated last month
- Improved file parsing for LLMs☆3,042Updated 9 months ago
- Deploy your agentic workflows to production☆2,053Updated 3 weeks ago
- 🦛 CHONK your texts with Chonkie ✨ — The no-nonsense RAG chunking library☆2,076Updated this week
- Task-Aware Agent-driven Prompt Optimization Framework☆3,510Updated 3 weeks ago
- ETL, Analytics, Versioning for Unstructured Data☆2,623Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…☆3,647Updated 3 months ago
- AdalFlow: The library to build & auto-optimize LLM applications.☆3,592Updated last week
- 📄🧠 PageIndex: Document Index for Reasoning-based RAG☆1,281Updated this week
- Developer APIs to Accelerate LLM Projects☆1,711Updated 10 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding☆2,335Updated this week
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL☆1,056Updated this week
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG☆1,364Updated 3 months ago
- A fast multimodal LLM for real-time voice☆4,158Updated this week
- The open LLM Ops platform - Traces, Analytics, Evaluations, Datasets and Prompt Optimization ✨☆2,438Updated this week
- Document to Markdown OCR library with Llama 3.2 vision☆2,381Updated 7 months ago
- The Open Source Memory Layer For Autonomous Agents☆2,304Updated 10 months ago