bytedance / Dolphin
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
☆7,680 · Updated last week
Alternatives and similar repositories for Dolphin
Users who are interested in Dolphin are comparing it to the libraries listed below.
 - Multilingual Document Layout Parsing in a Single Vision-Language Model ☆5,525 · Updated this week
 - "RAG-Anything: All-in-One RAG Framework" ☆9,710 · Updated 2 weeks ago
 - ContextGem: Effortless LLM extraction from documents ☆1,692 · Updated last month
 - 📄🧠 PageIndex: Document Index for Reasoning-based RAG ☆2,911 · Updated 2 weeks ago
 - Python library for Agentic Document Extraction from LandingAI ☆2,137 · Updated last week
 - Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement… ☆7,716 · Updated last week
 - An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/) ☆1,793 · Updated 2 months ago
 - A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive vi… ☆16,708 · Updated this week
 - 100+ Fine-tuning Tutorial Notebooks on Google Colab, Kaggle and more. ☆3,778 · Updated last week
 - ☆9,405 · Updated 2 months ago
 - 🦛 CHONK docs with Chonkie ✨ - The no-nonsense RAG library ☆3,051 · Updated last week
 - Eigent: The World's First Multi-agent Workforce to Unlock Your Exceptional Productivity. ☆2,365 · Updated this week
 - LLM agents built for control. Designed for real-world use. Deployed in minutes. ☆15,652 · Updated this week
 - AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/B… ☆2,367 · Updated last week
 - ☆2,054 · Updated 7 months ago
 - The most accurate document search and store for building AI apps ☆3,339 · Updated this week
 - Contexts Optical Compression ☆18,351 · Updated last week
 - RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal… ☆3,461 · Updated this week
 - "AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework" ☆7,697 · Updated 2 weeks ago
 - ☆2,041 · Updated this week
 - Toolkit for linearizing PDFs for LLM datasets/training ☆14,689 · Updated last week
 - The absolute trainer to light up AI agents. ☆5,815 · Updated this week
 - Multi-Language Backend Framework that unifies APIs, background jobs, workflows, and AI Agents into a single core primitive with built-in … ☆9,753 · Updated last week
 - Recursive-Open-Meta-Agent v0.1 (Beta). A meta-agent framework to build high-performance multi-agent systems. ☆4,425 · Updated last week
 - Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and se… ☆3,994 · Updated last week
 - Connect any AI model to 600+ integrations; powered by MCP 📡 🚀 ☆3,072 · Updated this week
 - Vision infrastructure to turn complex documents into RAG/LLM-ready data ☆2,903 · Updated last month
 - Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your … ☆4,475 · Updated this week
 - OCR model that handles complex tables, forms, handwriting with full layout. ☆278 · Updated this week
 - OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex lay… ☆2,350 · Updated 3 months ago