baichuan-inc / Baichuan-Omni-1.5

☆185

Alternatives and similar repositories for Baichuan-Omni-1.5

Users who are interested in Baichuan-Omni-1.5 compare it to the repositories listed below.

fengzi258 / Ocean-R1
☆29 · Mar 12, 2025 · Updated 11 months ago
VITA-MLLM / VITA
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
☆2,487 · Mar 28, 2025 · Updated 10 months ago
OmniMMI / OmniMMI
[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
☆21 · Dec 22, 2025 · Updated last month
baichuan-inc / Baichuan-Audio
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
☆217 · Feb 28, 2025 · Updated 11 months ago
01yzzyu / wikiautogen
[ICCV 2025] WikiAutoGen official page
☆24 · Feb 6, 2026 · Updated last week
westlake-baichuan-mllm / bc-omni
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
☆272 · Jan 27, 2025 · Updated last year
Ola-Omni / Ola
Ola: Pushing the Frontiers of Omni-Modal Language Model
☆385 · Jun 13, 2025 · Updated 8 months ago
JaaackHongggg / WorldSense
WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
☆38 · Jan 26, 2026 · Updated 3 weeks ago
HumanMLLM / ViSpeak
[ICCV 2025] Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos"
☆45 · Jul 1, 2025 · Updated 7 months ago
VITA-MLLM / Freeze-Omni
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
☆365 · May 27, 2025 · Updated 8 months ago
OpenBMB / UltraEval-Audio
Your faithful, impartial partner for authentic audio evaluation: know yourself, know your rivals.
☆275 · Feb 3, 2026 · Updated 2 weeks ago
RainBowLuoCS / OpenOmni
[NeurIPS 2025] OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align…
☆125 · Nov 8, 2025 · Updated 3 months ago
AV-Odyssey / AV-Odyssey
This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"
☆31 · Dec 23, 2024 · Updated last year
TianheL / LM-Implicit-Reasoning
[ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts
☆17 · Mar 11, 2025 · Updated 11 months ago
quicksviewer / quicksviewer
☆19 · Jun 29, 2025 · Updated 7 months ago
threegold116 / Awesome-Omni-MLLMs
Repository for the ACL 2025 Findings paper "From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities"
☆90 · Jan 3, 2026 · Updated last month
longvideobench / LongVideoBench
[NeurIPS '24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆113 · Jul 27, 2024 · Updated last year
multimodal-art-projection / OmniBench
A project for tri-modal LLM benchmarking and instruction tuning.
☆56 · Mar 27, 2025 · Updated 10 months ago
Han-Zongbo / Skip-n
This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'.
☆15 · Feb 12, 2024 · Updated 2 years ago
QwenLM / Qwen2.5-Omni
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and pe…
☆3,919 · Jun 12, 2025 · Updated 8 months ago
marinero4972 / CyberV
☆18 · Jun 10, 2025 · Updated 8 months ago
NVIDIA / audio-flamingo
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
☆994 · Dec 15, 2025 · Updated 2 months ago
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated last year
nahidalam / maya
Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya
☆125 · Aug 7, 2025 · Updated 6 months ago
akhilkedia / TranformersGetStable
[ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"
☆10 · Jul 19, 2024 · Updated last year
DAMO-NLP-SG / multimodal_textbook
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆191 · Mar 17, 2025 · Updated 11 months ago
zai-org / GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
☆3,140 · Dec 5, 2024 · Updated last year
CircleRadon / TokenPacker
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV 2025
☆276 · May 26, 2025 · Updated 8 months ago
chenllliang / MMEvalPro
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆24 · Sep 26, 2024 · Updated last year
rohinmanvi / Capability-Aware-and-Mid-Generation-Self-Evaluations
☆21 · Jul 25, 2025 · Updated 6 months ago
OpenMOSS / AnyGPT
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
☆870 · Aug 27, 2024 · Updated last year
ZhangXInFD / SpeechTokenizer
This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples a…
☆646 · Jun 9, 2024 · Updated last year
WeihuangLin / INF-LLaVA
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
☆42 · Aug 4, 2024 · Updated last year
uclaml / COPS
The official implementation of Cross-Task Experience Sharing (COPS)
☆29 · Oct 23, 2024 · Updated last year
shenao-zhang / SELM
The official implementation of Self-Exploring Language Models (SELM)
☆63 · Jun 4, 2024 · Updated last year
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 5 months ago
smallporridge / TrustworthyRAG
☆16 · Sep 17, 2024 · Updated last year
Gen-Verse / MMaDA
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
☆1,574 · Nov 16, 2025 · Updated 3 months ago
maitrix-org / Voila
☆486 · May 6, 2025 · Updated 9 months ago