Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"
☆31 · Apr 29, 2022 · Updated 3 years ago
Alternatives and similar repositories for PSL
Users that are interested in PSL are comparing it to the libraries listed below
- Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS" ☆23 · Mar 6, 2023 · Updated 2 years ago
- Streaming Audiotransformers for online Audio tagging ☆52 · Jun 14, 2024 · Updated last year
- steps to perform text-based speaker diarization with kaldi toolkit ☆12 · Nov 2, 2018 · Updated 7 years ago
- Materials of public talks given By SJTU X-LANCE members ☆14 · Dec 3, 2022 · Updated 3 years ago
- PolEval 2021 Task 1 ☆15 · Jun 28, 2022 · Updated 3 years ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models ☆14 · Oct 19, 2022 · Updated 3 years ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 · Dec 16, 2022 · Updated 3 years ago
- Submission to the HEAR2021 Challenge ☆17 · Mar 5, 2022 · Updated 3 years ago
- Python runtime for WeTextProcessing (does not depend on Pynini) ☆48 · Nov 28, 2025 · Updated 3 months ago
- Artie Bias Corpus: an audio corpus + code for detecting demographic bias ☆20 · Jul 21, 2020 · Updated 5 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. ☆14 · Sep 19, 2022 · Updated 3 years ago
- wake word spotting with kaldi ☆19 · Dec 3, 2020 · Updated 5 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z… ☆32 · Apr 8, 2022 · Updated 3 years ago
- The RWTH ASR Toolkit. ☆58 · Updated this week
- A baseline Automatic Speech Recognition system for Polish based on Kaldi. ☆18 · Dec 21, 2021 · Updated 4 years ago
- BurrMill core ☆22 · Nov 2, 2021 · Updated 4 years ago
- Phonetically-Oriented Word Error Rate ☆36 · May 4, 2019 · Updated 6 years ago
- ATC-Anno is an annotation tool for Air Traffic Control data that offers automatic semantic and concept annotation. ☆12 · Nov 17, 2023 · Updated 2 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN) ☆21 · Sep 27, 2017 · Updated 8 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ☆33 · Oct 23, 2025 · Updated 4 months ago
- A Python-based modular toolbox for building Deep Neural Network models (using PyTorch) for statistical parametric speech synthesis ☆23 · Dec 31, 2021 · Updated 4 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN) ☆62 · Apr 15, 2020 · Updated 5 years ago
- The codebase for Data-driven general-purpose voice activity detection. ☆93 · Aug 3, 2023 · Updated 2 years ago
- System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection ☆28 · Jul 6, 2022 · Updated 3 years ago
- Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres… ☆23 · Mar 18, 2024 · Updated last year
- (semi) Grapheme-to-Phoneme (G2P) - seq2seq model using PyTorch for Korean ☆23 · Dec 17, 2017 · Updated 8 years ago
- Unsupervised speech activity detection system. ☆11 · Jul 2, 2018 · Updated 7 years ago
- A Semi-supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues ☆10 · Aug 18, 2023 · Updated 2 years ago
- Docker for building an environment for Dutch online and offline ASR. ☆12 · Feb 2, 2021 · Updated 5 years ago
- [Tiny KWS] SparkNet: Sparse Binarization for Fast Keyword Spotting ☆17 · Aug 26, 2025 · Updated 6 months ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. ☆11 · Nov 3, 2020 · Updated 5 years ago