InternLM/StarBench

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/InternLM/StarBench)

InternLM / StarBench

[ICLR 2026] An official implementation of "STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence"

☆39

Alternatives and similar repositories for StarBench

Users that are interested in StarBench are comparing it to the libraries listed below

Sorting:

InternLM / ARC-VL
View on GitHub
[CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC"
☆37Nov 26, 2025Updated 3 months ago
InternLM / SIM-CoT
View on GitHub
[ICLR 2026] An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"
☆173Feb 4, 2026Updated 3 weeks ago
InternLM / ARM-Thinker
View on GitHub
[CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"
☆82Feb 13, 2026Updated 2 weeks ago
SYuan03 / MM-IFEngine
View on GitHub
[ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following
☆117Feb 13, 2026Updated 2 weeks ago
Cooperx521 / ScaleCap
View on GitHub
(ICLR 2026)Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing’
☆58Jan 26, 2026Updated last month
zszheng147 / VoiceCraft-X
View on GitHub
☆32Nov 18, 2025Updated 3 months ago
Bujiazi / HiFlow
View on GitHub
[NeurIPS 2025] Official implementation of HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
☆85Sep 18, 2025Updated 5 months ago
InternLM / CapRL
View on GitHub
[ICLR 2026] An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"
☆187Feb 8, 2026Updated 2 weeks ago
furiosa-ai / uncage
View on GitHub
UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation
☆18Aug 12, 2025Updated 6 months ago
the-bird-F / GLM-Voice-RAG
View on GitHub
[EMNLP 2025 Findings] A complete cross-modal RAG system for end-to-end speech-to-speech large models, including ASR-based Retrieval and E…
☆27Jul 11, 2025Updated 7 months ago
alibaba / vstyle
View on GitHub
☆30Sep 15, 2025Updated 5 months ago
Ruiqi-Yan / URO-Bench
View on GitHub
Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
☆50Sep 2, 2025Updated 5 months ago
XiaomiMiMo / MiMo-Audio-Training
View on GitHub
☆97Oct 16, 2025Updated 4 months ago
InternLM / Spatial-SSRL
View on GitHub
[CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"
☆111Dec 27, 2025Updated 2 months ago
Bujiazi / DiCache
View on GitHub
[ICLR 2026] Official implementation of DiCache: Let Diffusion Model Determine Its Own Cache
☆55Jan 26, 2026Updated last month
Shy-98 / MELLE
View on GitHub
Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"
☆41Jun 28, 2025Updated 8 months ago
ddlBoJack / Omni-Captioner
View on GitHub
[ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.
☆119Oct 17, 2025Updated 4 months ago
Wiselnn570 / VideoRoPE
View on GitHub
[ICML 2025 Oral] An official implementation of VideoRoPE & VideoRoPE++
☆218Feb 2, 2026Updated 3 weeks ago
inclusionAI / Ming-UniAudio
View on GitHub
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
☆432Nov 27, 2025Updated 3 months ago
zszheng147 / Spatial-AST
View on GitHub
🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)
☆73Feb 13, 2025Updated last year
KlingAIResearch / PhysMaster
View on GitHub
Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning
☆57Oct 16, 2025Updated 4 months ago
YujieOuO / SMIE
View on GitHub
[ACMMM 23] Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization
☆29Nov 4, 2023Updated 2 years ago
YujieOuO / PSTL
View on GitHub
[AAAI 2023(Oral)] Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences
☆27May 14, 2024Updated last year
yanghaha0908 / EmoVoice
View on GitHub
Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting"
☆109Oct 16, 2025Updated 4 months ago
meituan-longcat / UNO-Bench
View on GitHub
Omni Model Benchmark with high quality and diversity, which reveals the Compositional Law. We’re now focused on Chinese scenarios — and a…
☆76Jan 12, 2026Updated last month
CS-BAOYAN / CSInternship2025
View on GitHub
☆57Jul 8, 2025Updated 7 months ago
alibaba / unified-audio
View on GitHub
An Open-Source Project to Unify Audio Processing and Generation
☆207Jan 29, 2026Updated 3 weeks ago
CodeGoat24 / Pref-GRPO
View on GitHub
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
☆232Feb 10, 2026Updated 2 weeks ago
JaaackHongggg / WorldSense
View on GitHub
WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
☆39Jan 26, 2026Updated last month
X-LANCE / Xmart
View on GitHub
Xmart青年论坛仓库，存放历史学生论坛和前沿讲座的视频回放和讲义，获取最新Xmart预告欢迎关注公众号【XLANCE Lab】
☆50Dec 20, 2025Updated 2 months ago
OPPO-Mente-Lab / X2I
View on GitHub
Official code for ICCV 2025 paper, X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distill…
☆90Jun 26, 2025Updated 8 months ago
v-iashin / Synchformer
View on GitHub
Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)
☆107Sep 15, 2025Updated 5 months ago
jianganbai / FISHER
View on GitHub
A Foundation Model for Industrial Signal Comprehensive Representation
☆57Feb 13, 2026Updated 2 weeks ago
tencent-ailab / MuQ
View on GitHub
Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".
☆310Aug 4, 2025Updated 6 months ago
AIGeeksGroup / DragMesh
View on GitHub
DragMesh: Interactive 3D Generation Made Easy
☆20Dec 28, 2025Updated 2 months ago
OpenMOSS / Embodied-Planner-R1
View on GitHub
Embodied-Planner-R1: Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
☆25Jan 5, 2026Updated last month
FreedomIntelligence / MTalk-Bench
View on GitHub
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
☆16Nov 19, 2025Updated 3 months ago
mastra-ai / template-browsing-agent
View on GitHub
A powerful integration that combines Browserbase's Stagehand with Mastra for advanced web automation, scraping, and AI-powered web intera…
☆33Feb 4, 2026Updated 3 weeks ago
193746 / VHASR
View on GitHub
☆11Oct 31, 2024Updated last year