my-yy / s2v_rc
Speech2Vec Reality Check
☆83Updated 2 years ago
Alternatives and similar repositories for s2v_rc
Users who are interested in s2v_rc are comparing it to the libraries listed below.
- ☆134Updated last year
- [NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Spee…☆16Updated last year
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.☆13Updated 2 years ago
- Pytorch implementation of stochastically quantized variational autoencoder (SQ-VAE)☆190Updated 2 years ago
- A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022☆36Updated last year
- Keras implementation of Finite Scalar Quantization☆77Updated last year
- Vector Quantized Autoregressive Predictive Coding (VQ-APC)☆37Updated 4 years ago
- ICLR2023 statistics☆60Updated last year
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph"☆54Updated 2 years ago
- Contrastively Disentangled Sequential Variational Autoencoder☆46Updated 9 months ago
- Official code for "Maximum Likelihood Training for Score-Based Diffusion ODEs by High-Order Denoising Score Matching" (ICML 2022)☆61Updated 2 years ago
- Source code for the paper 'Audio Captioning Transformer'☆54Updated 3 years ago
- Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".☆55Updated 2 years ago
- Can audio-visual integration strengthen robustness under multimodal attacks?☆28Updated 3 years ago
- Code for ICML2020 paper - CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information☆338Updated last year
- This repo contains script to download MUSIC dataset from youtube☆10Updated last year
- ☆41Updated last year
- Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"☆29Updated 4 months ago
- [ICCV 2025] SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer☆279Updated 6 months ago
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning☆49Updated last year
- Representation learning for NLP @ JSALT19☆39Updated 4 years ago
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)☆11Updated 2 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation☆87Updated 9 months ago
- A Pytorch Implementation of Finite Scalar Quantization☆141Updated last year
- Interspeech Tutorial - Resource Efficient and Cross-Modal Learning Toward Foundation Modeling☆15Updated last year
- Official PyTorch implementation of SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy M…☆35Updated 10 months ago
- Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)☆72Updated 4 years ago
- Non-Autoregressive Predictive Coding☆51Updated 4 years ago
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In…☆44Updated last year
- Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation☆25Updated 3 years ago