NVIDIA / nv-ingest
NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
☆2,840Updated this week
Alternatives and similar repositories for nv-ingest
Users who are interested in nv-ingest are comparing it to the libraries listed below.
- Knowledge Agents and Management in the Cloud☆4,231Updated last week
- Vision infrastructure to turn complex documents into RAG/LLM-ready data☆2,935Updated 4 months ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.☆1,483Updated 5 months ago
- A system for agentic LLM-powered data processing and ETL☆3,525Updated last week
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st…☆1,448Updated 9 months ago
- A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.☆3,797Updated last week
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL☆1,140Updated this week
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid…☆2,733Updated last month
- Document to Markdown OCR library with Llama 3.2 vision☆2,424Updated last year
- Fast State-of-the-Art Static Embeddings☆1,992Updated last month
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.☆2,503Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.☆7,275Updated 11 months ago
- ☆2,112Updated 10 months ago
- Task-Aware Agent-driven Prompt Optimization Framework☆3,753Updated 3 months ago
- Improved file parsing for LLMs☆3,152Updated last year
- AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, acc…☆1,545Updated last week
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents…☆2,973Updated 2 months ago
- High-performance retrieval engine for unstructured data☆1,553Updated 2 months ago
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents☆6,095Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry☆4,314Updated 2 months ago
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.☆2,855Updated 2 weeks ago
- RAG that intelligently adapts to your use case, data, and queries☆3,687Updated 3 months ago
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG☆1,464Updated 8 months ago
- A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data☆1,525Updated 8 months ago
- Official Implementation of "KBLaM: Knowledge Base augmented Language Model"☆1,436Updated 3 months ago
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.☆7,647Updated 3 months ago
- open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig…☆933Updated last year
- Empowering RAG with a memory-based data interface for all-purpose applications!☆2,208Updated 4 months ago
- Detect and extract tables to markdown and csv☆754Updated last year
- Use late-interaction multi-modal models such as ColPali in just a few lines of code.☆842Updated last year