HarunoriKawano / Wav2vec2.0
Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations" in PyTorch.
☆55 · May 19, 2023 · Updated 2 years ago
Alternatives and similar repositories for Wav2vec2.0
Users who are interested in Wav2vec2.0 are comparing it to the libraries listed below
- Trustworthy Speech Emotion Recognition ☆13 · May 22, 2023 · Updated 2 years ago
- ☆12 · Feb 9, 2021 · Updated 5 years ago
- Wav2vec 2.0 Self-Supervised Pretraining ☆58 · Feb 6, 2025 · Updated last year
- Text to Speech with PyTorch (English and Mongolian) ☆12 · May 3, 2020 · Updated 5 years ago
- SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge ☆12 · Jun 11, 2024 · Updated last year
- PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INT… ☆38 · Feb 27, 2022 · Updated 3 years ago
- a simplified version of wav2vec(1.0, vq, 2.0) in fairseq ☆167 · Sep 21, 2020 · Updated 5 years ago
- Dynamic time warping for audio matching and alignment ☆22 · Mar 15, 2018 · Updated 7 years ago
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆53 · Jun 29, 2024 · Updated last year
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 · Jul 31, 2023 · Updated 2 years ago
- LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models ☆25 · Aug 11, 2024 · Updated last year
- An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners" ☆31 · May 31, 2023 · Updated 2 years ago
- Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch. ☆68 · Jul 19, 2025 · Updated 6 months ago
- A toolset for easy formant extraction and visualization from wav files and TTS models ☆33 · Sep 2, 2022 · Updated 3 years ago
- CARMA Streets is a component of CARMA ecosystem, which enables such a coordination among different transportation users. This component p… ☆11 · Aug 21, 2025 · Updated 5 months ago
- Phase-aware speech enhancement with Deep Complex U-Net ☆135 · Mar 22, 2023 · Updated 2 years ago
- REPeating Pattern Extraction Technique (REPET) in Python for audio source separation: original REPET, REPET extended, adaptive REPET, REP… ☆33 · Feb 16, 2024 · Updated 2 years ago
- SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems ☆39 · Nov 1, 2023 · Updated 2 years ago
- A lightweight library to read/write wave audio files to/from lists of native Python types. ☆12 · Jun 10, 2024 · Updated last year
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆80 · May 20, 2023 · Updated 2 years ago
- experiments about AudioSet ☆43 · Jul 22, 2023 · Updated 2 years ago
- PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022) ☆149 · Nov 22, 2022 · Updated 3 years ago
- A PyTorch implementation of the paper: SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification ☆34 · Jun 25, 2021 · Updated 4 years ago
- The GitHub repository for the paper "Denoising Application of Magnetotelluric Low-Frequency Signal Processing" ☆11 · Feb 22, 2023 · Updated 2 years ago
- Supporting code for instrumentation courses at Universidade Nova de Lisboa - Faculdade de Ciência de Lisboa ☆16 · Oct 7, 2022 · Updated 3 years ago
- Repository for fine-tuning BEATs and using BEATs as feature extractor in a prototypical network. This repository has been used to complet… ☆34 · Dec 28, 2025 · Updated last month
- TensorFlow training scripts for depthwise separable convolutional neural networks for keyword spotting, and C++ code for deployment. ☆39 · Apr 2, 2020 · Updated 5 years ago
- Chainer implementation of between-class learning for sound recognition https://arxiv.org/abs/1711.10282 ☆96 · Mar 27, 2018 · Updated 7 years ago
- Parkinson’s Disease Classification from Speech Data using multiple Machine Learning approaches. This was implemented using scikit-learn P… ☆14 · Feb 2, 2020 · Updated 6 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- My front-end demo collection ☆11 · Oct 11, 2024 · Updated last year
- Tool for slot extraction from text ☆15 · Oct 23, 2022 · Updated 3 years ago
- MV-RAG combines retrieval with multi-view generation to create accurate 3D-consistent visuals. By retrieving reference images and text, i… ☆23 · Nov 29, 2025 · Updated 2 months ago
- ☆43 · Dec 1, 2025 · Updated 2 months ago
- ConMamba for Automatic Speech Recognition ☆102 · Aug 12, 2024 · Updated last year
- Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer ☆38 · Feb 17, 2025 · Updated last year
- Accompanying code for the paper Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection. ☆10 · Jun 7, 2022 · Updated 3 years ago
- Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition ☆153 · Oct 26, 2021 · Updated 4 years ago
- Pre-trained Wav2vec2.0 for Mandarin ☆43 · Oct 30, 2022 · Updated 3 years ago