ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation
☆29Mar 10, 2024Updated 2 years ago
Alternatives and similar repositories for ASiT
Users that are interested in ASiT are comparing it to the libraries listed below.
- [AAAI 2024] DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification☆12Mar 10, 2025Updated last year
- unofficial pytorch implementation of HiFi-GAN with fast MISR.☆15Mar 21, 2023Updated 3 years ago
- Python implementation of the paper "Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection"☆29Apr 26, 2024Updated last year
- Learning differentiable temporal resolution on time-series data.☆37Nov 12, 2022Updated 3 years ago
- ☆14Aug 12, 2022Updated 3 years ago
- text to speech☆10Mar 19, 2024Updated 2 years ago
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)☆29Jul 9, 2024Updated last year
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation☆13Feb 18, 2026Updated last month
- ☆118May 13, 2025Updated 10 months ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year
- EVAR ~ Evaluation package for Audio Representations☆75Feb 19, 2026Updated last month
- Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".☆418Aug 14, 2022Updated 3 years ago
- ☆67Aug 16, 2023Updated 2 years ago
- Official implementation of Hierarchical Spectrogram Transformers (HST)☆20Oct 10, 2022Updated 3 years ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform☆18May 17, 2024Updated last year
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi…☆39Jul 31, 2024Updated last year
- Code of our ISMIR 2025 paper - D. Afchar, G. Meseguer Brocal, K. Akesbi, R. Hennequin☆35Nov 12, 2025Updated 4 months ago
- ARCH: Audio Representations benCHmark☆54Aug 26, 2024Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations"☆41Aug 14, 2025Updated 7 months ago
- Keras implementation of "Chord Generation from Symbolic Melody Using BLSTM Networks"☆13Aug 8, 2021Updated 4 years ago
- Streaming Audiotransformers for online Audio tagging☆53Jun 14, 2024Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 10 months ago
- ☆12May 30, 2023Updated 2 years ago
- ☆22Apr 4, 2023Updated 2 years ago
- Code repository for ‘Adaptive Differential Denoising for Respiratory Sounds Classification’☆21Dec 19, 2025Updated 3 months ago
- The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"☆480Sep 18, 2025Updated 6 months ago
- Singing voice conversion based on FreeVC☆21Dec 16, 2022Updated 3 years ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- Multiband version of HIFIGAN☆19Nov 6, 2020Updated 5 years ago
- ☆25Jan 24, 2023Updated 3 years ago
- Code for our paper "Acoustic Features Fusion using Attentive Multi-channel Deep Architecture" in Keras and tensorflow☆26Nov 23, 2018Updated 7 years ago
- Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations☆100Feb 20, 2026Updated last month
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆41Aug 29, 2024Updated last year
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆93Jun 9, 2022Updated 3 years ago
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆83Nov 7, 2025Updated 4 months ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆13Mar 11, 2025Updated last year
- Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included…☆40Sep 5, 2022Updated 3 years ago