infinigence/Infini-Megrez


☆338

Alternatives and similar repositories for Infini-Megrez

Users who are interested in Infini-Megrez are comparing it to the libraries listed below.

infinigence / Infini-Megrez-Omni
☆243 · Feb 21, 2025 · Updated last year
WePOINTS / WePOINTS
☆189 · Mar 13, 2026 · Updated 4 months ago
VITA-MLLM / VITA
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
☆2,520 · Mar 28, 2025 · Updated last year
infinigence / LVEval
Repository of LV-Eval Benchmark
☆78 · Aug 31, 2024 · Updated last year
zai-org / GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English speech dialogue model
☆3,206 · Dec 5, 2024 · Updated last year
infinigence / Semi-PD
A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.
☆127 · Dec 25, 2025 · Updated 6 months ago
fuvty / DeSCo
[WSDM'24 Oral] The official implementation of paper <DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting>
☆24 · Mar 11, 2024 · Updated 2 years ago
OpenBMB / MiniCPM
MiniCPM5-1B: A SOTA 1B on-device LLM, small yet powerful.
☆9,967 · Jun 20, 2026 · Updated last month
infinigence / SpecEE
Repo for SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting (ISCA25)
☆75 · Apr 25, 2025 · Updated last year
OmniSteward / OmniSteward
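🐼 An all-round steward based on an LLM Agent that, through voice or text interaction, calls tools to control smart-home devices (HomeAssistant/Mi Home) and computers. Highly extensible, with unlimited possibilities.
☆112 · Dec 25, 2024 · Updated last year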
OpenGVLab / InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
☆10,098 · Sep 22, 2025 · Updated 9 months ago
OpenBMB / MiniCPM-V
A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone
☆25,948 · Jun 25, 2026 · Updated 3 weeks ago
baaivision / Emu3
Next-Token Prediction is All You Need
☆2,432 · Jan 12, 2026 · Updated 6 months ago
gpt-omni / mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming…
☆3,562 · Nov 5, 2024 · Updated last year
zai-org / CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
☆2,436 · Mar 3, 2025 · Updated last year
QwenLM / Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
☆2,088 · Apr 21, 2025 · Updated last year
zai-org / CogView4
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
☆1,101 · Mar 29, 2025 · Updated last year
xverse-ai / XVERSE-V-13B
☆78 · May 6, 2024 · Updated 2 years ago
Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,154 · Feb 10, 2025 · Updated last year
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,865 · Mar 18, 2025 · Updated last year
hlchen23 / ADPN-MM
Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…
☆51 · Dec 30, 2023 · Updated 2 years ago
inclusionAI / Ming
Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.
☆664 · Mar 17, 2026 · Updated 4 months ago
VITA-MLLM / Freeze-Omni
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
☆388 · May 27, 2025 · Updated last year
jdh-algo / JoyHallo
JoyHallo: Digital human model for Mandarin
☆519 · Sep 21, 2025 · Updated 10 months ago
infinigence / FUSCO
High-performance distributed data shuffling (all-to-all) library for MoE training and inference
☆123 · Mar 7, 2026 · Updated 4 months ago
bytedance / Valley
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, video, and audio data.
☆287 · May 8, 2026 · Updated 2 months ago
TencentARC / BrushEdit
[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
☆587 · Sep 3, 2025 · Updated 10 months ago
Vision-CAIR / LongVU
[ICML 2025] Official PyTorch implementation of LongVU
☆429 · May 8, 2025 · Updated last year
FunAudioLLM / CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
☆22,292 · May 25, 2026 · Updated last month
Alibaba-NLP / OmniSearch
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
☆429 · Apr 22, 2025 · Updated last year
X-PLUG / Multi-LLM-Agent
☆242 · Apr 23, 2024 · Updated 2 years ago
thu-nics / qllm-eval
Code Repository of Evaluating Quantized Large Language Models
☆135 · Sep 8, 2024 · Updated last year
Aria-Zhangjl / StoryWeaver
[AAAI 2025] StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization
☆227 · Updated this week
jefferyZhan / GThinker
[CVPR 2026] GThinker, Reasoning MLLM, Visual Cues, Visual Rethinking
☆18 · Mar 9, 2026 · Updated 4 months ago
QwenLM / Qwen2.5-Omni
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and pe…
☆4,039 · Jun 12, 2025 · Updated last year
UITron-hub / UITron-Speech
☆21 · Jan 22, 2026 · Updated 5 months ago
xverse-ai / XVERSE-MoE-A36B
XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.
☆37 · Sep 12, 2024 · Updated last year
zai-org / GLM-V
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
☆2,356 · Updated this week
InternLM / InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
☆2,921 · May 26, 2025 · Updated last year