Mixture-AI / meta-llama-explain
Explanation of the llama2 repo.
☆11Updated last year
Alternatives and similar repositories for meta-llama-explain
Users who are interested in meta-llama-explain are comparing it to the repositories listed below
- A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.☆13Updated last year
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey☆474Updated last year
- This is for the ACL 2025 Findings paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities☆87Updated 3 weeks ago
- Official repository of MMDU dataset☆103Updated last year
- Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding"☆77Updated 11 months ago
- ☆58Updated 6 months ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI☆248Updated 3 months ago
- MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning☆138Updated 3 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆347Updated 3 weeks ago
- [ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding☆95Updated 10 months ago
- [ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.☆532Updated 3 weeks ago
- A paper list of Awesome Latent Space.☆319Updated this week
- Official github repo of G-LLaVA☆148Updated 11 months ago
- ☆153Updated 8 months ago
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency☆137Updated 5 months ago
- ☆31Updated 5 months ago
- [ICLR 2025 Spotlight] The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multi…☆174Updated 10 months ago
- The Next Step Forward in Multimodal LLM Alignment☆196Updated 9 months ago
- [NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing☆90Updated 6 months ago
- A RLHF Infrastructure for Vision-Language Models☆193Updated last year
- (ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'☆58Updated this week
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"☆140Updated 3 weeks ago
- ☆156Updated 11 months ago
- [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"☆203Updated last year
- Official PyTorch implementation of EMOVA in CVPR 2025 (https://arxiv.org/abs/2409.18042)☆76Updated 10 months ago
- MM-Eureka V0, also called R1-Multimodal-Journey; the latest version is in MM-Eureka☆322Updated 7 months ago
- Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"☆394Updated 4 months ago
- Some study notes on the official LLaVA code☆29Updated last year
- [ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R…☆109Updated 6 months ago
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".☆430Updated 5 months ago