NVIDIA / nv-ingest
NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
☆2,760Updated this week
Alternatives and similar repositories for nv-ingest
Users who are interested in nv-ingest are comparing it to the libraries listed below.
- Vision infrastructure to turn complex documents into RAG/LLM-ready data☆2,908Updated last month
- A system for agentic LLM-powered data processing and ETL☆3,065Updated this week
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.☆1,455Updated 2 months ago
- Fast State-of-the-Art Static Embeddings☆1,900Updated this week
- Knowledge Agents and Management in the Cloud☆4,204Updated this week
- Build custom inference engines for models, agents, multi-modal systems, RAG, pipelines and more.☆3,690Updated last week
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st…☆1,349Updated 6 months ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.☆2,298Updated 2 weeks ago
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid…☆2,680Updated 3 weeks ago
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.☆7,221Updated 8 months ago
- The open LLM Ops platform - Traces, Analytics, Evaluations, Datasets and Prompt Optimization ✨☆2,605Updated last week
- A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data☆1,515Updated 5 months ago
- Task-Aware Agent-driven Prompt Optimization Framework☆3,674Updated last month
- Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig…☆933Updated 9 months ago
- Detect and extract tables to markdown and csv☆755Updated 9 months ago
- AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation☆4,399Updated this week
- Improved file parsing for LLMs☆3,135Updated last year
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL☆1,103Updated last week
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding☆2,488Updated this week
- 🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines☆3,203Updated this week
- RAG that intelligently adapts to your use case, data, and queries☆3,583Updated 2 weeks ago
- AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, acc…☆1,339Updated last week
- The most accurate document search and store for building AI apps☆3,363Updated this week
- Use late-interaction multi-modal models such as ColPali in just a few lines of code.☆829Updated 9 months ago
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web☆2,321Updated 5 months ago
- Things you can do with the token embeddings of an LLM☆1,450Updated 3 weeks ago
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents☆5,933Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry☆4,284Updated 2 months ago
- AdalFlow: The library to build & auto-optimize LLM applications.☆3,873Updated last month
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG☆1,432Updated 5 months ago