Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

☆5,879

Related projects

Alternatives and complementary repositories for GOT-OCR2.0

InternLM / MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
☆5,076 Updated this week
opendatalab / PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
☆5,310 Updated 2 weeks ago
VikParuchuri / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆13,808 Updated this week
opendatalab / MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
☆13,711 Updated this week
adithya-s-k / omniparse
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
☆5,347 Updated this week
microsoft / OmniParser
A simple screen parsing tool towards pure vision based GUI agent
☆4,323 Updated this week
jina-ai / reader
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
☆6,879 Updated last week
getomni-ai / zerox
PDF to Markdown with vision models
☆5,927 Updated this week
OpenGVLab / InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
☆5,947 Updated last week
OpenBMB / MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
☆12,497 Updated 2 weeks ago
CosmosShadow / gptpdf
Using GPT to parse PDF
☆2,997 Updated 3 months ago
xorbitsai / inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you ne…
☆5,325 Updated this week
QwenLM / Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆2,983 Updated last month
Dicklesworthstone / llm_aided_ocr
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
☆2,156 Updated 2 months ago
SWivid / F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
☆6,676 Updated this week
FunAudioLLM / SenseVoice
Multilingual Voice Understanding Model
☆3,349 Updated 3 weeks ago
microsoft / graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
☆18,831 Updated this week
THUDM / CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
☆2,100 Updated 2 months ago
lipku / LiveTalking
Real time interactive streaming digital human
☆3,827 Updated 2 weeks ago
deepseek-ai / DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
☆3,574 Updated last month
sugarforever / chat-ollama
ChatOllama is an open source chatbot based on LLMs. It supports a wide range of language models, and knowledge base management.
☆2,643 Updated 2 months ago
VikParuchuri / marker
Convert PDF to markdown quickly with high accuracy
☆17,568 Updated this week
modelscope / FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity…
☆6,853 Updated this week
netease-youdao / QAnything
Question and Answer based on Anything.
☆11,805 Updated 2 weeks ago
severian42 / GraphRAG-Local-UI
GraphRAG using Local LLMs - Features robust API and multiple apps for Indexing/Prompt Tuning/Query/Chat/Visualizing/Etc. This is meant to…
☆1,695 Updated 2 months ago
huggingface / speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT4-o
☆3,483 Updated last week
DS4SD / docling
Get your documents ready for gen AI
☆7,243 Updated this week
bklieger-groq / g1
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
☆3,854 Updated last month
mendableai / firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
☆18,297 Updated this week
QwenLM / Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
☆9,354 Updated this week