XiaoMi/xiaomi-mimo-vl-miloco

Xiaomi MiMo-VL-Miloco

☆223

Alternatives and similar repositories for xiaomi-mimo-vl-miloco

Users interested in xiaomi-mimo-vl-miloco are comparing it to the repositories listed below.

xiaomi-research / timeviper
[CVPR'26] TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding
☆25 · Jan 4, 2026 · Updated 6 months ago
xiaomi-research / colar
[NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
☆97 · Jun 29, 2026 · Updated 3 weeks ago
xiaomi-research / q-frame
[ICCV 2025] Implementation of the paper "Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs"
☆81 · Oct 25, 2025 · Updated 8 months ago
Darwin-Agent / awesome-world-models-for-digital-agents
Digital Agents Meet World Models: A Survey
☆50 · May 8, 2026 · Updated 2 months ago
xiaomi-research / mecat
☆44 · May 12, 2026 · Updated 2 months ago
XiaoMi / subllm
This repository is the official implementation of the ECAI 2024 conference paper SUBLLM: A Novel Efficient Architecture with Token Sequen…
☆68 · Aug 13, 2024 · Updated last year
XenoZLH / Shuffle-R1
Official code repository of Shuffle-R1
☆26 · Feb 23, 2026 · Updated 4 months ago
xiaomi-research / dar
DAR introduces the diagonal scanning order for next-token prediction and proposes a direction-aware autoregressive transformer framework.
☆19 · Apr 16, 2025 · Updated last year
xiaomi-research / svor
SVOR - Stable Video Object Removal
☆116 · May 20, 2026 · Updated 2 months ago
guxu313 / TeViS
☆21 · Aug 26, 2025 · Updated 10 months ago
XiaoMi / dasheng
Official PyTorch code for Deep Audio-Signal Holistic Embeddings
☆199 · Nov 7, 2025 · Updated 8 months ago
xiaomi-research / controlfoley
[ACM MM 2026] ControlFoley: Unified and Controllable Video-to-Audio Generation with Cross-Modal Conflict Handling
☆142 · Updated this week
SeerRay-Lab / Xiaomi-GUI-0
[Technical Report] An End-to-End Multimodal GUI Agent for Real Mobile Environments
☆79 · Updated this week
pro-assist / ProAssist
☆20 · Jul 21, 2025 · Updated last year
xiaomi-research / gemmax
Gemma-based Multilingual Machine Translation Models
☆51 · Feb 13, 2026 · Updated 5 months ago
nasosger / MuToR
[NeurIPS '25] Multi-Token Prediction Needs Registers
☆30 · Dec 14, 2025 · Updated 7 months ago
yaolinli / TimeChat-Online
[ACM MM 2025] TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆132 · Jun 29, 2026 · Updated 3 weeks ago
Aman-4-Real / MMTG
[ACM MM 2022] (Oral): Multi-Modal Experience Inspired AI Creation
☆21 · Nov 27, 2024 · Updated last year
XiaomiMiMo / MiMo-V2-Flash
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model
☆1,360 · Jan 8, 2026 · Updated 6 months ago
yester31 / Cutlass_EX
study of cutlass
☆22 · Nov 10, 2024 · Updated last year
lcqysl / FrameThinker
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50 · Oct 9, 2025 · Updated 9 months ago
sen-ye / R3
[ICLR26] Understanding VS. Generation: Navigating Optimization Dilemma in Multimodal Models
☆25 · May 6, 2026 · Updated 2 months ago
1ranGuan / VST
[ECCV 26] Video Streaming Thinking
☆115 · Jun 18, 2026 · Updated last month
mala-lab / OpenCIL
Official code for paper "OpenCIL: Benchmarking Out-of-Distribution Detection in Class-Incremental Learning"
☆13 · Jun 19, 2024 · Updated 2 years ago
XiaomiMiMo / MiMo
MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining
☆2,285 · Jun 5, 2025 · Updated last year
XiaomiMiMo / MiMo-VL
MiMo-VL
☆642 · Aug 21, 2025 · Updated 11 months ago
chaoqi7 / BSA-CIL-3D
Boosting the Class-Incremental Learning in 3D Point Clouds via Zero-Collection-Cost Basic Shape Pre-Training
☆13 · Nov 30, 2024 · Updated last year
NieeiM / Dasheng-Audiogen
Generate a complete audio clip with music, intelligible speech, and sound effects from text in one pass.
☆44 · May 27, 2026 · Updated last month
astra-vision / StableMTL
[CVPR 2026] Official repository of "StableMTL: Repurposing Latent Diffusion Models for Multi-Task Learning from Partially Annotated Synth…
☆18 · Feb 21, 2026 · Updated 5 months ago
Adlith / MoE-Jetpack
[NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
☆137 · Nov 23, 2024 · Updated last year
yijunshens / StateFactory
Official implementation of "Reward Prediction with Factorized World States"
☆20 · Mar 11, 2026 · Updated 4 months ago
AlbertTan404 / RoLD
[MMM 2025 Best Paper] RoLD: Robot Latent Diffusion for Multi-Task Policy Modeling
☆24 · Aug 4, 2024 · Updated last year
mmact19 / challenge
MMAct Challenge
☆13 · Jun 20, 2021 · Updated 5 years ago
1ranGuan / thinkomni
[ICLR26] ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding
☆93 · Mar 20, 2026 · Updated 4 months ago
kostassolo / dangers-of-human-touch
Repository for the defense mechanism of the paper "The Dangers of Human Touch: Fingerprinting Browser Extensions through User Actions"
☆10 · Feb 10, 2022 · Updated 4 years ago
MSC-XDU / MSCer_blog_rss
A collection of blog addresses of MSC members
☆14 · Jun 26, 2023 · Updated 3 years ago
threegold116 / Awesome-Omni-MLLMs
This is for ACL 2025 Findings Paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities
☆103 · Mar 22, 2026 · Updated 4 months ago
MrZilinXiao / ProxyThinker
[ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.
☆22 · Sep 24, 2025 · Updated 9 months ago
XiangTodayEatsWhat / EagleVision
EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing
☆26 · May 29, 2025 · Updated last year