huckiyang/Interspeech23-Tutorial-Para-Efficient-Cross-Modal-Tutorial

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/huckiyang/Interspeech23-Tutorial-Para-Efficient-Cross-Modal-Tutorial)

huckiyang / Interspeech23-Tutorial-Para-Efficient-Cross-Modal-Tutorial

Interspeech Tutorial - Resource Efficient and Cross-Modal Learning Toward Foundation Modeling

☆15

Alternatives and similar repositories for Interspeech23-Tutorial-Para-Efficient-Cross-Modal-Tutorial

Users that are interested in Interspeech23-Tutorial-Para-Efficient-Cross-Modal-Tutorial are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

etzinis / optimal_condition_training
View on GitHub
Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…
☆14Feb 15, 2023Updated 3 years ago
zengchang233 / CrossSinger
View on GitHub
The source code for the paper CrossSinger (asru2023)
☆18Oct 12, 2023Updated 2 years ago
Splend1d / T5lephone
View on GitHub
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19Nov 29, 2022Updated 3 years ago
LiChenda / Multi-clue-TSE-data
View on GitHub
Data simulation scripts for paper "Target Sound Extraction with Variable Cross-modality Clues"
☆17May 19, 2023Updated 3 years ago
Exgc / AVMuST-TED
View on GitHub
☆24Mar 30, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
chaufanglin / Normal2Whisper
View on GitHub
Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"
☆14Oct 31, 2024Updated last year
JusperLee / Gull-Codec-Training
View on GitHub
☆12Mar 11, 2025Updated last year
RanaCM / DSU-AVO
View on GitHub
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12May 13, 2024Updated 2 years ago
david-gimeno / tailored-avsr
View on GitHub
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
☆14Feb 24, 2025Updated last year
joannahong / AV-RelScore
View on GitHub
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆35Jun 20, 2023Updated 3 years ago
soumimaiti / speechlmscore_tool
View on GitHub
☆34Nov 24, 2024Updated last year
wenet-e2e / wecut
View on GitHub
video cut powered by AI
☆23Nov 15, 2022Updated 3 years ago
stoneMo / MGN
View on GitHub
Official implementation for MGN
☆20Dec 22, 2022Updated 3 years ago
xinshengwang / robpitch
View on GitHub
A pitch detection model trained to be robust against noise and reverberation environments.
☆27Jan 21, 2025Updated last year
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
xiaoxiaomiao323 / MSA
View on GitHub
☆16Feb 19, 2026Updated 5 months ago
dbdydgur2244 / vim-setting
View on GitHub
비임 가즈으아
☆11Aug 16, 2018Updated 7 years ago
nguyenvulebinh / AVSRCocktail
View on GitHub
Audio-Visual Speech Recognition
☆26Jul 7, 2025Updated last year
JonathanDZ / TF-FaSNet
View on GitHub
☆24Feb 28, 2023Updated 3 years ago
slp-rl / SLM-Discrete-Representations
View on GitHub
This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…
☆20Jan 3, 2023Updated 3 years ago
JusperLee / Look2hear
View on GitHub
A toolkit for researchers in the multimodal sound separation.
☆16Oct 20, 2023Updated 2 years ago
NeuroWave-ai / CUCVAE-TTS
View on GitHub
☆25Mar 12, 2022Updated 4 years ago
sungnyun / avsr-temporal-dynamics
View on GitHub
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13Oct 22, 2024Updated last year
NVlabs / MCPNet
View on GitHub
[CVPR 2024] Official Repository for MCPNet: An Interpretable Classifier via Multi-Level Concept Prototypes
☆20Jul 8, 2024Updated 2 years ago
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
jmandel / fun-with-formants
View on GitHub
Speech formant tracking code in Python
☆15Oct 10, 2013Updated 12 years ago
ahaliassos / usr2
View on GitHub
PyTorch implementation of USR 2.0 (ICLR 2026)
☆15Apr 3, 2026Updated 3 months ago
vectominist / spin
View on GitHub
Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…
☆65May 19, 2023Updated 3 years ago
tts-tutorial / icassp2022
View on GitHub
☆64May 23, 2022Updated 4 years ago
mispchallenge / MISP-ICME-AVSR
View on GitHub
☆17Jan 1, 2024Updated 2 years ago
gustavo-beck / wavebender-gan
View on GitHub
☆25Sep 27, 2022Updated 3 years ago
voidful / vall-e-encodec
View on GitHub
☆41May 15, 2023Updated 3 years ago
sungnyun / cav2vec
View on GitHub
(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
☆16Apr 29, 2025Updated last year
Joshua-1995 / LearnableUpsamplingLayer-Pytorch
View on GitHub
Pytorch implementation of LearnableUpsamplingLayer (NaturalSpeech, Tan et al., 2022)
☆57Mar 12, 2024Updated 2 years ago
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
KeiKinn / ParaCLAP
View on GitHub
Towards a general language-audio model for computational paralinguistic tasks
☆30Dec 14, 2024Updated last year
pprablanc / ppsrt
View on GitHub
A python algorithm to change the pitch of the voice in real time
☆13Dec 13, 2020Updated 5 years ago
JaesungHuh / VoxMovies
View on GitHub
Evaluation script for VoxMovies dataset in PyTorch
☆23Jan 12, 2024Updated 2 years ago
michimoeller / liftingLayers
View on GitHub
☆12Dec 8, 2022Updated 3 years ago
qiuqiangkong / mini_llm
View on GitHub
☆29Jul 4, 2025Updated last year
jim-meyer / lottery_ticket_pruner
View on GitHub
(Personal project) Pruning algorithm for DNNs using "lottery ticket" pruning
☆10Dec 8, 2022Updated 3 years ago
yangdongchao / Tim-TSENet
View on GitHub
The source code of Tim-TSENet
☆15Apr 22, 2022Updated 4 years ago