NVIDIA / nv-ingest
NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.
☆2,685Updated this week
Alternatives and similar repositories for nv-ingest
Users who are interested in nv-ingest are comparing it to the libraries listed below
- A system for agentic LLM-powered data processing and ETL☆2,223Updated this week
- Vision infrastructure to turn complex documents into RAG/LLM-ready data☆2,206Updated last week
- The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML.☆3,298Updated this week
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st…☆1,136Updated last month
- RAG that intelligently adapts to your use case, data, and queries☆3,320Updated 2 months ago
- ContextGem: Effortless LLM extraction from documents☆1,180Updated this week
- Fast State-of-the-Art Static Embeddings☆1,732Updated 2 weeks ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.☆1,281Updated last week
- The open LLM Ops platform - Traces, Analytics, Evaluations, Datasets and Prompt Optimization ✨☆2,064Updated this week
- Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer…☆3,428Updated this week
- Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig…☆921Updated 4 months ago
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web☆2,274Updated last week
- Open source multi-modal RAG for building AI apps over private knowledge.☆2,662Updated this week
- The python library for real-time communication☆4,037Updated last week
- AdalFlow: The library to build & auto-optimize LLM applications.☆3,328Updated 2 months ago
- A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data☆1,429Updated last month
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid…☆2,586Updated 3 weeks ago
- LOTUS: A semantic query engine for fast and easy LLM-powered data processing☆1,199Updated 3 weeks ago
- 🦛 CHONK your texts with Chonkie ✨ — The no-nonsense RAG chunking library☆1,478Updated last week
- Easy token price estimates for 400+ LLMs. TokenOps.☆1,716Updated this week
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.☆1,964Updated this week
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents…☆2,614Updated last month
- The AI-native proxy server for agents. Arch handles the pesky low-level work in building agentic apps like calling specific tools, routin…☆2,726Updated last week
- Task-Aware Agent-driven Prompt Optimization Framework☆3,331Updated 2 weeks ago
- Memory for AI Agents in 5 lines of code☆5,533Updated last week
- A Kubernetes deployable instance of GroundX for document parsing, storage, and search.☆758Updated this week
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL☆1,015Updated this week
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding☆2,148Updated this week
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.☆950Updated last week
- A fast multimodal LLM for real-time voice☆4,016Updated 4 months ago