NVIDIA / nv-ingest
NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.
☆2,671 Updated last week
Alternatives and similar repositories for nv-ingest
Users interested in nv-ingest are comparing it to the libraries listed below.
- A system for agentic LLM-powered data processing and ETL ☆1,987 Updated last week
- Fast State-of-the-Art Static Embeddings ☆1,688 Updated this week
- Task-Aware Agent-driven Prompt Optimization Framework ☆3,265 Updated last week
- RAG that intelligently adapts to your use case, data, and queries ☆3,280 Updated last month
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ☆2,182 Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆6,449 Updated 3 months ago
- Knowledge Agents and Management in the Cloud ☆3,984 Updated last week
- Open source multi-modal RAG for building AI apps over private knowledge. ☆2,385 Updated this week
- 📄 🧠 PageIndex: Document Index System for Reasoning-based RAG ☆870 Updated last week
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web ☆2,195 Updated last week
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,108 Updated last month
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid… ☆2,568 Updated this week
- The open LLM Ops platform - Traces, Analytics, Evaluations, Datasets and Prompt Optimization ✨ ☆1,907 Updated this week
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,254 Updated last week
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ☆6,904 Updated this week
- AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation ☆3,987 Updated 3 weeks ago
- ContextGem: Effortless LLM extraction from documents ☆1,007 Updated this week
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL ☆976 Updated this week
- Improved file parsing for LLMs ☆2,977 Updated 6 months ago
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆906 Updated last week
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents… ☆2,587 Updated last month
- Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer… ☆3,329 Updated this week
- An Open Source implementation of Notebook LM with more flexibility and features ☆1,688 Updated last month
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆5,284 Updated this week
- AdalFlow: The library to build & auto-optimize LLM applications. ☆3,073 Updated 2 months ago
- Build Real-Time Knowledge Graphs for AI Agents ☆9,878 Updated this week
- High-performance retrieval engine for unstructured data ☆1,387 Updated this week
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG ☆1,301 Updated this week
- A fast multimodal LLM for real-time voice ☆3,968 Updated 3 months ago
- The AI-native proxy server for agents. Arch handles the pesky low-level work in building agentic apps like calling specific tools, routin… ☆2,641 Updated this week