MOSS-VL is the core multimodal model series within the OpenMOSS ecosystem, dedicated to visual understanding.
☆259 · Jun 1, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for MOSS-VL
Users who are interested in MOSS-VL are comparing it to the repositories listed below.
- A minimal, educational HEVC (H.265) encoder written in Python. ☆53 · Feb 23, 2026 · Updated 3 months ago
- Code for "Exponential Family Estimation via Adversarial Dynamics Embedding" (NeurIPS 2019) ☆14 · Nov 26, 2019 · Updated 6 years ago
- [ICLR 2025 Spotlight] Weak-to-strong preference optimization: stealing reward from weak aligned model ☆18 · Feb 24, 2025 · Updated last year
- ☆20 · Jul 5, 2024 · Updated last year
- [CVPR 2025 Highlight] Official PyTorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models" ☆60 · Aug 15, 2025 · Updated 9 months ago
- [NeurIPS 2024] Can Language Models Learn to Skip Steps? ☆21 · Jan 25, 2025 · Updated last year
- ☆95 · Oct 21, 2025 · Updated 7 months ago
- ☆50 · Jun 4, 2026 · Updated last week
- Explaining audio differences using language ☆16 · Feb 11, 2025 · Updated last year
- ☆13 · Mar 23, 2026 · Updated 2 months ago
- OpenMMLab Detection Toolbox and Benchmark ☆11 · Aug 1, 2023 · Updated 2 years ago
- Repository for "Training Audio Captioning Models without Audio" ☆10 · Sep 26, 2023 · Updated 2 years ago
- SimKO: Simple Pass@K Policy Optimization ☆30 · Oct 24, 2025 · Updated 7 months ago
- ☆14 · Jun 17, 2023 · Updated 2 years ago
- The official repo for the DanQing dataset. ☆36 · Mar 25, 2026 · Updated 2 months ago
- [ACM MM25] Official PyTorch implementation of "Decoupled Global-Local Alignment for Improving Compositional Understanding" ☆16 · Jul 15, 2025 · Updated 10 months ago
- The official repository of TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module… ☆29 · Nov 18, 2025 · Updated 6 months ago
- Audio Entailment: Deductive Reasoning for Audio Understanding ☆17 · Dec 10, 2024 · Updated last year
- VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation ☆20 · Jun 2, 2025 · Updated last year
- Official repository for the UAE paper, unified-GRPO, and unified-Bench ☆165 · Sep 12, 2025 · Updated 9 months ago
- Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation ☆16 · Mar 28, 2026 · Updated 2 months ago
- Convert PDF to pages of images ☆13 · Apr 18, 2020 · Updated 6 years ago
- ☆79 · May 4, 2025 · Updated last year
- ☆13 · Jul 14, 2024 · Updated last year
- Data simulation scripts for paper "Target Sound Extraction with Variable Cross-modality Clues" ☆17 · May 19, 2023 · Updated 3 years ago
- Understanding Self-Supervised Learning in a non-IID Setting ☆21 · Oct 21, 2022 · Updated 3 years ago
- MoveNet C++ deployment; model converted from TensorFlow ☆14 · Nov 17, 2021 · Updated 4 years ago
- The first interleaved framework for textual reasoning within the visual generation process ☆162 · Mar 16, 2026 · Updated 2 months ago
- Unity SDK for rendering, tracking, input, interaction, mixed reality, platform services ☆17 · Dec 18, 2025 · Updated 5 months ago
- ☆14 · May 17, 2022 · Updated 4 years ago
- ☆11 · Nov 15, 2016 · Updated 9 years ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆22 · Oct 8, 2024 · Updated last year
- [EMNLP25 Main] The official code of "Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval" ☆25 · Mar 30, 2026 · Updated 2 months ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models ☆23 · Jul 10, 2024 · Updated last year
- WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection ☆18 · Nov 19, 2024 · Updated last year
- Collaborative Training of Large Language Models in an Efficient Way ☆420 · Aug 28, 2024 · Updated last year
- Very deep VAEs in JAX/Flax ☆47 · Jun 16, 2021 · Updated 4 years ago
- Stochastic trace estimation using JAX ☆18 · Aug 20, 2025 · Updated 9 months ago
- An interactive CLI tool that uses PHP PDO. ☆14 · Dec 17, 2023 · Updated 2 years ago