NMS05 / Multimodal-Fusion-with-Attention-Bottlenecks
☆39Nov 22, 2024Updated last year
Alternatives and similar repositories for Multimodal-Fusion-with-Attention-Bottlenecks
Users that are interested in Multimodal-Fusion-with-Attention-Bottlenecks are comparing it to the libraries listed below
- Deep Variational Information Bottleneck (DVIB) in PyTorch.☆10Apr 25, 2020Updated 5 years ago
- ☆17Jan 1, 2024Updated 2 years ago
- PyTorch implementation of "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scorin…☆20Apr 3, 2024Updated last year
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆22Apr 27, 2024Updated last year
- PyTorch implementation of the models described in the IEEE ICASSP 2022 paper "Is cross-attention preferable to self-attention for multi-m…☆63Mar 29, 2025Updated 10 months ago
- An end-to-end MLOps capstone project for educational purposes.☆22Mar 7, 2025Updated 11 months ago
- PyTorch implementation of Conformer with a training script for end-to-end speech recognition on the LibriSpeech dataset.☆28May 1, 2024Updated last year
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…☆35Jun 20, 2023Updated 2 years ago
- PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)☆20Apr 11, 2022Updated 3 years ago
- [TAFFC 2024] The official implementation of paper: From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Rec…☆93Oct 28, 2025Updated 3 months ago
- Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep…☆12Jul 26, 2023Updated 2 years ago
- [KDD 2026 ADS Track] Pytorch implementation of the paper "Hi-Guard: Towards Trustworthy Multimodal Moderation via Policy-Aligned Reasonin…☆19Jan 13, 2026Updated last month
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- Anki add-on that adds Pinyin and Zhuyin readings above Chinese characters in any field.☆12Sep 23, 2025Updated 4 months ago
- ☆11Oct 29, 2024Updated last year
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis"☆26Nov 21, 2025Updated 2 months ago
- Companion code for Awe the Audience: How the Narrative Trajectories Affect Audience Perception in Public Speaking☆14Jan 6, 2018Updated 8 years ago
- Official repository for ACM Multimedia'24 paper "MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube a…☆17Aug 11, 2024Updated last year
- Dataset and pre-trained model of EMNLP-IJCNLP 2019 paper "TalkDown: A Corpus for Condescension Detection in Context."☆10Jan 26, 2020Updated 6 years ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"☆14Feb 24, 2025Updated 11 months ago
- ☆11Sep 1, 2024Updated last year
- ☆11Jan 29, 2023Updated 3 years ago
- ☆14Dec 7, 2025Updated 2 months ago
- Methods to extract tracks from time-frequency distributions; tracks can represent instantaneous frequency (IF) laws☆10May 11, 2016Updated 9 years ago
- [PRCV-2023, IEEE TMM-2025] Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion based Classification☆12Dec 20, 2025Updated last month
- ☆10May 6, 2024Updated last year
- Arctic sea ice interannual variability and change☆11Mar 26, 2018Updated 7 years ago
- With the advent of Industry 4.0, manufacturing industries are competing to adopt intelligent machining systems into their processes to ge…☆13May 22, 2020Updated 5 years ago
- PyTorch implementation of "Robust Multi-View Clustering with Noisy Correspondence" (TKDE 2024)☆11Aug 2, 2024Updated last year
- [NeurIPS 2024] Unsupervised Hierarchy-Agnostic Segmentation: Parsing Semantic Image Structure☆10Nov 27, 2025Updated 2 months ago
- EmoCapCLIP: Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions☆20Jul 29, 2025Updated 6 months ago
- Official implementation of DGP-based multi-speaker speech synthesis with PyTorch☆24Mar 23, 2021Updated 4 years ago
- Using CNNs to classify 101 different food categories with VGG16, AlexNet, and SVM☆10Jan 6, 2020Updated 6 years ago
- ☆11Jun 2, 2022Updated 3 years ago
- Unofficial Pytorch Lightning Implementation of "Towards Robust Speech Super-Resolution"☆10May 8, 2023Updated 2 years ago
- Comparing performance of different InfoNCE type losses used in contrastive learning.☆14Jun 12, 2024Updated last year
- The Sea Ice Evaluation Tool (SITool) is a performance metrics and diagnostics tool developed to evaluate the model skills in simulating t…☆12May 17, 2023Updated 2 years ago
- ☆10Nov 28, 2018Updated 7 years ago
- DeepMMSE: A Deep Learning Approach to MMSE-based Noise Power Spectral Density Estimation☆11Jun 4, 2020Updated 5 years ago