A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.
☆13Sep 19, 2024Updated last year
Alternatives and similar repositories for LLaVA-MOSS2
Users who are interested in LLaVA-MOSS2 are comparing it to the libraries listed below.
- VehicleWorld is the first comprehensive multi-device environment for intelligent vehicle interaction that accurately models the complex, …☆21Sep 16, 2025Updated 5 months ago
- Vision Transformer (ViT) models, with their attention mechanisms, revolutionized computer vision. By merging Class Activation Map (CAM) a…☆13Aug 14, 2023Updated 2 years ago
- ☆11Jan 12, 2023Updated 3 years ago
- [ICLR 2025] This repo is the official implementation of "The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs".☆13Jan 25, 2025Updated last year
- ☆12Mar 5, 2024Updated last year
- FamilyTool benchmark☆12Sep 10, 2025Updated 5 months ago
- Information Extraction related tools and models☆10Mar 16, 2023Updated 2 years ago
- ☆13Jul 28, 2024Updated last year
- ☆14May 20, 2025Updated 9 months ago
- dMel: Speech Tokenization Made Simple☆16May 13, 2025Updated 9 months ago
- ☆11Aug 23, 2020Updated 5 years ago
- ☆20Nov 21, 2025Updated 3 months ago
- Interpreting CLIP with Hierarchical Sparse Autoencoders (ICML 2025)☆20Jan 17, 2026Updated last month
- Official implementation of "NoiseAR: AutoRegressing Initial Noise Prior for Diffusion Models"☆18Jun 3, 2025Updated 9 months ago
- [🔥ACM MM2025] EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation☆23Dec 30, 2025Updated 2 months ago
- Our first tutorial: make your own Augmented Reality app.☆12Jul 25, 2024Updated last year
- Mind map for Andrew Ng's Machine Learning course and for popular AI platforms and libraries.☆11Dec 1, 2023Updated 2 years ago
- ☆15Jan 22, 2024Updated 2 years ago
- [AAAI 2026] This repository is the official implementation of "ReAlign: Text-to-Motion Generation via Step-Aware Reward-Guided Alignment"…☆27Feb 12, 2026Updated 2 weeks ago
- [NCA] Official implementation of the paper Motion2Language, Unsupervised learning of synchronized semantic motion segmentation☆13Sep 9, 2024Updated last year
- Official pytorch implementation of "RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language…☆14Dec 16, 2024Updated last year
- ☆13Oct 9, 2024Updated last year
- ☆14Oct 17, 2023Updated 2 years ago
- MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models☆58Jul 24, 2025Updated 7 months ago
- A Decade of Action Quality Assessment: Largest Systematic Survey of Trends, Challenges, and Future Directions☆15Jan 22, 2026Updated last month
- Official implementation of "PAPR in Motion: Seamless Point-level 3D Scene Interpolation"☆13Nov 6, 2024Updated last year
- Enhancing True Correspondence Discrimination through Relation Consistency for Robust Noisy Correspondence Learning (CVPR 2025, pytorch co…☆14Sep 29, 2025Updated 5 months ago
- Fine-grained Figure Skating dataset (FineFS) involves RGB videos and estimated skeleton data, providing rich annotations for multiple dow…☆18Sep 15, 2024Updated last year
- A modular implementation of product of experts VAEs for multimodal data☆13Nov 15, 2021Updated 4 years ago
- ☆13Nov 20, 2023Updated 2 years ago
- [ICLR'25] The first benchmark aiming to evaluate whether LMMs can assist oracle bone inscription processing tasks☆22Mar 21, 2025Updated 11 months ago
- Repo for paper "MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding".☆39Jun 9, 2025Updated 8 months ago
- ☆11Jul 18, 2021Updated 4 years ago
- Author's implementation of learning virtual chimeras by dynamic motion reassembly (SIGGRAPH Asia 2022 Technical Paper)☆14Feb 20, 2023Updated 3 years ago
- Source code for the Findings of EMNLP 2021 paper "Keyphrase Generation with Fine-Grained Evaluation-Guided Reinforcement Learning"☆13Nov 9, 2021Updated 4 years ago
- [ACL 2025 🔥] Time Travel is a Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts☆18May 22, 2025Updated 9 months ago
- Build a bridge that connects beginners to deep reinforcement learning.☆11Sep 23, 2024Updated last year
- Preprocessed data of SignDiff: Learning Diffusion Models for American Sign Language Production☆17May 1, 2025Updated 10 months ago
- Archive of materials from the (former) Tokyo Institute of Technology IGP group☆15Oct 10, 2024Updated last year