tuanct1997 / Federated-Learning-ASR-based-on-wav2vec-2.0

☆18

Alternatives and similar repositories for Federated-Learning-ASR-based-on-wav2vec-2.0

Users who are interested in Federated-Learning-ASR-based-on-wav2vec-2.0 are comparing it to the repositories listed below.

naver / multilingual-distilwhisper
This repository contains all the code necessary for running the multilingual distilwhisper from the Ferraz et al. 2024 IEEE ICASSP paper.
☆33 | Oct 23, 2025 | Updated 3 months ago
xuchennlp / S2T
The project for speech translation
☆12 | Sep 28, 2023 | Updated 2 years ago
microsoft / NoAudioCaptioning
Repository for "Training Audio Captioning Models without Audio"
☆10 | Sep 26, 2023 | Updated 2 years ago
SSTC-Challenge / SSTC2024_baseline_system
☆11 | Jun 14, 2024 | Updated last year
zds-potato / multilingual-phonetic-sv
☆10 | Dec 22, 2023 | Updated 2 years ago
zhang-tuo-pdf / FedAudio
[ICASSP 2023] FedAudio: A Federated Learning Benchmark for Audio and Speech Tasks
☆51 | Feb 21, 2024 | Updated last year
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆12 | Feb 5, 2025 | Updated last year
xiaoxue1117 / speech-mamba-public
☆14 | Nov 26, 2024 | Updated last year
gweltou / anaouder-cli
Breton speech recognition with Vosk
☆16 | Nov 24, 2025 | Updated 2 months ago
backspacetg / distilXLSR
Models and code for the INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 | Mar 30, 2025 | Updated 10 months ago
RanaCM / DSU-AVO
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12 | May 13, 2024 | Updated last year
Miamoto / Conformer-NTM
☆16 | Nov 9, 2023 | Updated 2 years ago
AIRI-Institute / AI4TALK
☆13 | Dec 7, 2022 | Updated 3 years ago
ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
☆16 | Jan 20, 2025 | Updated last year
ErikEkstedt / conv_ssl
☆14 | Feb 9, 2023 | Updated 3 years ago
Xianchao-Wu / wenet-deep-sparse-conformer
☆15 | Aug 25, 2022 | Updated 3 years ago
THUsatlab / BERT-LID
Leveraging BERT to Improve Spoken Language Identification
☆17 | Nov 22, 2022 | Updated 3 years ago
kaistmm / voxceleb-disentangler
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18 | Jul 23, 2024 | Updated last year
skit-ai / Map-Mix
The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix (work accepted at I…
☆18 | Feb 17, 2023 | Updated 2 years ago
sholokhovalexey / online-speaker-clustering
☆18 | Mar 4, 2023 | Updated 2 years ago
7Xin / DPI-TTS
☆13 | Sep 12, 2024 | Updated last year
aispeech-lab / w2v-cif-bert
☆37 | Jun 28, 2021 | Updated 4 years ago
mubingshen / MLC-SLM-Baseline
The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…
☆49 | May 14, 2025 | Updated 9 months ago
kamperh / vqwordseg
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
☆39 | Mar 4, 2024 | Updated last year
YoshikiMas / madeon-asr
[SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition
☆18 | Dec 1, 2024 | Updated last year
rithiksachdev / PostASR-Correction-SLT2024
☆17 | Jul 22, 2024 | Updated last year
NKU-HLT / KNN-CTC
[ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
☆42 | Mar 20, 2024 | Updated last year
liu12366262626 / AlignVSR
Visual Speech Recognition
☆19 | Dec 24, 2024 | Updated last year
Yip-Jia-Qi / codecformer
☆21 | Jul 15, 2024 | Updated last year
laboroai / TEDxJP-10K
☆24 | Jan 14, 2021 | Updated 5 years ago
Srijith-rkr / KAUST-Whisper-Adapter
INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!
☆43 | Sep 11, 2023 | Updated 2 years ago
ms-dot-k / TMT
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
☆18 | May 23, 2024 | Updated last year
saarus72 / text_normalization
T5-based (Russian) text normalization
☆25 | Jan 25, 2024 | Updated 2 years ago
kuan2jiu99 / Awesome-Speech-Generation
Survey on speech generation work.
☆21 | Nov 26, 2023 | Updated 2 years ago
DanielMengLiu / AudioVisualLip
☆24 | Feb 20, 2024 | Updated last year
Lhx94As / E2E-language-diarization
Source code of the paper "End-to-End Language Diarization for Bilingual Code-switching Speech"
☆19 | Jan 23, 2022 | Updated 4 years ago
Tonyyouyou / Mamba-in-Speech
☆54 | Jul 1, 2024 | Updated last year
MingLunHan / CIF-ColDec
[ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection
☆25 | May 18, 2023 | Updated 2 years ago
voiceboxneurips / voicebox
☆23 | Jan 6, 2023 | Updated 3 years ago
