[TMLR 2022] High-Modality Multimodal Transformer
☆117 Nov 2, 2024 Updated last year
Alternatives and similar repositories for HighMMT
Users that are interested in HighMMT are comparing it to the libraries listed below
- ☆10 Aug 20, 2024 Updated last year
- [NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning ☆618 Jan 27, 2024 Updated 2 years ago
- Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch ☆39 Sep 6, 2021 Updated 4 years ago
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆99 Aug 22, 2024 Updated last year
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆116 Nov 30, 2022 Updated 3 years ago
- Holistic evaluation of multimodal foundation models ☆49 Aug 11, 2024 Updated last year
- [ICML 2023] Provable Dynamic Fusion for Low-Quality Multimodal Data ☆119 Jun 28, 2025 Updated 8 months ago
- Interpretability method to find what navigation agents learn ☆19 Jun 16, 2022 Updated 3 years ago
- The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL) ☆310 Sep 22, 2025 Updated 6 months ago
- Source materials for CoinFT ☆29 Jan 23, 2026 Updated last month
- Official code for the paper: "Metadata Archaeology" ☆19 May 10, 2023 Updated 2 years ago
- Experiment for Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning ☆26 Jan 16, 2023 Updated 3 years ago
- Official implementation of the paper The Hidden Language of Diffusion Models ☆78 Jan 24, 2024 Updated 2 years ago
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy ☆73 Nov 13, 2023 Updated 2 years ago
- Variational Reinforcement Learning ☆17 Jul 25, 2024 Updated last year
- [Preprint 2022] “Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?” by Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zh… ☆63 Jan 18, 2023 Updated 3 years ago
- ☆14 May 31, 2022 Updated 3 years ago
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon! ☆11 May 24, 2023 Updated 2 years ago
- [ICCV 2021] Multimodal Knowledge Expansion ☆10 Aug 28, 2021 Updated 4 years ago
- Rust bindings for CTranslate2 ☆14 Jun 21, 2023 Updated 2 years ago
- (CVPR Workshop Best Paper Award) Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustn… ☆17 Nov 4, 2025 Updated 4 months ago
- This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'. ☆15 Feb 12, 2024 Updated 2 years ago
- This is an implementation of the paper "Are We Done with Object-Centric Learning?" ☆12 Sep 11, 2025 Updated 6 months ago
- ☆54 Dec 30, 2024 Updated last year
- Improved diffusion generative models with subspaces ☆135 Jun 1, 2022 Updated 3 years ago
- Implementation for NATv2. ☆23 Feb 20, 2021 Updated 5 years ago
- Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances ☆12 Aug 14, 2022 Updated 3 years ago
- Repository for the PopulAtion Parameter Averaging (PAPA) paper ☆31 Apr 11, 2024 Updated last year
- AgentHive provides the primitives and helpers for a seamless usage of robohive within TorchRL. ☆35 Jan 12, 2024 Updated 2 years ago
- Source code for the paper "Policy Architectures for Compositional Generalization in Control" ☆30 May 19, 2022 Updated 3 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation ☆42 Dec 23, 2023 Updated 2 years ago
- Interactively evolve various types of art (pictures, animations, shapes, and sounds) using Compositional Pattern Producing Networks ☆21 May 9, 2018 Updated 7 years ago
- Code for Point-Calibrated Spectral Neural Operators ☆20 Oct 15, 2024 Updated last year
- [AAAI 2023 (Oral)] CrissCross: Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity ☆25 Jul 11, 2023 Updated 2 years ago
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023) ☆12 Jun 1, 2023 Updated 2 years ago
- Beta-VAE, Conditional-VAE, Total Correlation-VAE, FactorVAE, Relevance Factor-VAE, Multi-Level VAE, (Soft)-IntroVAE (Beta-Version), LVAE,… ☆17 Aug 19, 2025 Updated 7 months ago
- This is the official code for NeurIPS 2023 paper "Learning Unseen Modality Interaction" ☆18 Jan 22, 2024 Updated 2 years ago
- ☆14 Oct 24, 2025 Updated 4 months ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023) ☆32 May 15, 2023 Updated 2 years ago