Sara-Ahmed / ASiT
ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation
☆28Mar 10, 2024Updated last year
Alternatives and similar repositories for ASiT
Users that are interested in ASiT are comparing it to the libraries listed below
- Unofficial PyTorch implementation of HiFi-GAN with fast MISR.☆15Mar 21, 2023Updated 2 years ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year
- text to speech☆10Mar 19, 2024Updated last year
- [AAAI 2024] DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification☆12Mar 10, 2025Updated 11 months ago
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation☆13Dec 12, 2024Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- Unofficial implementation of ConvNeXt-TTS powered by Lightning☆18Oct 20, 2024Updated last year
- Python implementation of the paper "Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection"☆27Apr 26, 2024Updated last year
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform☆18May 17, 2024Updated last year
- Code repository for ‘Adaptive Differential Denoising for Respiratory Sounds Classification’☆20Dec 19, 2025Updated last month
- Multiband version of HiFi-GAN☆19Nov 6, 2020Updated 5 years ago
- ☆66Aug 16, 2023Updated 2 years ago
- ☆22Apr 4, 2023Updated 2 years ago
- Singing voice conversion based on FreeVC☆21Dec 16, 2022Updated 3 years ago
- ☆113May 13, 2025Updated 9 months ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- Learning differentiable temporal resolution on time-series data.☆36Nov 12, 2022Updated 3 years ago
- ☆25Jan 24, 2023Updated 3 years ago
- ☆12Feb 3, 2026Updated last week
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- A neural speech codec based on discrete WavLM representations☆24Aug 28, 2024Updated last year
- Streaming audio transformers for online audio tagging☆51Jun 14, 2024Updated last year
- Audio tokenization, in the fastest way possible!☆53Aug 26, 2024Updated last year
- DysfluentWFST☆17Nov 13, 2025Updated 3 months ago
- ARCH: Audio Representations benCHmark☆53Aug 26, 2024Updated last year
- A simple app for recording speech datasets.☆26Jun 27, 2022Updated 3 years ago
- Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"☆12Oct 31, 2024Updated last year
- SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech☆11Jun 30, 2023Updated 2 years ago
- A Chinese singing voice dataset: professional male singer, 105 songs, 132 minutes☆11Oct 19, 2023Updated 2 years ago
- Unofficial PyTorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake…☆57Aug 7, 2023Updated 2 years ago
- [SpeechCom Journal] Learning and controlling the source-filter representation of speech with a variational autoencoder☆45Apr 18, 2023Updated 2 years ago
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)☆30Jul 9, 2024Updated last year
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023☆27Apr 27, 2023Updated 2 years ago
- Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".☆414Aug 14, 2022Updated 3 years ago
- 4G GPU and 10 minutes for training☆12Aug 9, 2023Updated 2 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 9 months ago
- VITS2 using Phoneme-Level Japanese BERT☆14Dec 17, 2023Updated 2 years ago
- Chinese polyphone disambiguation for Text-to-Speech application☆42Jun 11, 2024Updated last year
- My vocoder experiments☆31Jul 26, 2025Updated 6 months ago