adithya-s-k / omniparse
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
☆6,470Updated this week
Alternatives and similar repositories for omniparse:
Users who are interested in omniparse are comparing it to the libraries listed below
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆7,310Updated 3 months ago
- This project is a template to help my pupils from SENAC build their own HTML5 projects☆11Updated 9 months ago
- KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning a…☆6,348Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system☆24,532Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/☆8,548Updated this week
- The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.☆3,367Updated this week
- 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.☆14,802Updated this week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆7,449Updated 2 months ago
- A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.☆30,300Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy☆24,025Updated last week
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)☆6,295Updated 3 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages☆17,105Updated this week
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.☆10,861Updated last week
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.☆6,530Updated this week
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid…☆2,508Updated 2 weeks ago
- Python scraper based on AI☆19,047Updated this week
- Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you ne…☆7,461Updated this week
- Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.☆5,424Updated this week
- A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API.☆7,641Updated this week
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.☆34,783Updated this week
- A simple, easy-to-hack GraphRAG implementation☆2,775Updated this week
- OCR & Document Extraction using vision models☆10,877Updated 2 weeks ago
- Task-Aware Agent-driven Prompt Optimization Framework☆3,142Updated 3 weeks ago
- 📃 A better UX for chat, writing content, and coding with LLMs.☆4,342Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.☆5,958Updated last month
- "LightRAG: Simple and Fast Retrieval-Augmented Generation"☆14,837Updated this week
- Neo4j graph construction from unstructured data using LLMs☆3,292Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training☆11,088Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate☆7,016Updated 3 weeks ago
- Knowledge Agents and Management in the Cloud☆3,875Updated this week