deepseek-ai / DeepSeek-OCR-2
Visual Causal Flow
☆1,306 Updated this week
Alternatives and similar repositories for DeepSeek-OCR-2
Users who are interested in DeepSeek-OCR-2 are comparing it to the repositories listed below.
- ☆1,495 Updated 2 weeks ago
- ☆925 Updated last week
- GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning ☆2,145 Updated this week
- Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models ☆3,373 Updated 2 weeks ago
- MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B ☆1,560 Updated 2 weeks ago
- MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model ☆1,028 Updated 3 weeks ago
- ☆1,282 Updated last week
- ☆1,222 Updated 3 months ago
- A framework for efficient model inference with omni-modality models ☆2,491 Updated this week
- ☆1,540 Updated 2 months ago
- MiMo-VL ☆622 Updated 5 months ago
- An End-to-End Infrastructure for Training and Evaluating Various LLM Agents ☆674 Updated this week
- Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed. ☆732 Updated 7 months ago
- MiniMax-M2, a model built for Max coding & agentic workflows. ☆2,298 Updated 2 months ago
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities ☆1,139 Updated 6 months ago
- ☆1,759 Updated 4 months ago
- Cook up amazing multimodal AI applications effortlessly with MiniCPM-o ☆242 Updated last month
- ☆869 Updated 3 months ago
- The paper list of "Memory in the Age of AI Agents: A Survey" ☆969 Updated last week
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat… ☆1,533 Updated 7 months ago
- ZeroSearch: Incentivize the Search Capability of LLMs without Searching ☆1,236 Updated 5 months ago
- OpenCUA: Open Foundations for Computer-Use Agents ☆661 Updated 2 weeks ago
- AgentEvolver: Towards Efficient Self-Evolving Agent System ☆1,109 Updated this week
- A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow. ☆875 Updated 6 months ago
- Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce… ☆436 Updated 2 weeks ago
- ☆1,278 Updated 2 months ago
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆588 Updated 2 weeks ago
- [AAAI 2026 🔥 Poster] ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning ☆320 Updated 5 months ago
- Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression" ☆553 Updated 2 months ago
- A minimal yet professional single agent demo project that showcases the core execution pipeline and production-grade features of agents. ☆1,381 Updated 2 weeks ago