ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation
☆29Mar 10, 2024Updated 2 years ago
Alternatives and similar repositories for ASiT
Users who are interested in ASiT are comparing it to the libraries listed below.
- [AAAI 2024] DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification☆12Mar 10, 2025Updated last year
- Unofficial PyTorch implementation of HiFi-GAN with fast MISR.☆15Mar 21, 2023Updated 3 years ago
- Python implementation of the paper "Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection"☆29Apr 26, 2024Updated last year
- Learning differentiable temporal resolution on time-series data.☆37Nov 12, 2022Updated 3 years ago
- ☆14Aug 12, 2022Updated 3 years ago
- text to speech☆10Mar 19, 2024Updated 2 years ago
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)☆29Jul 9, 2024Updated last year
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation☆13Feb 18, 2026Updated last month
- ☆125May 13, 2025Updated 11 months ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year
- EVAR ~ Evaluation package for Audio Representations☆75Feb 19, 2026Updated last month
- Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".☆418Aug 14, 2022Updated 3 years ago
- ☆68Aug 16, 2023Updated 2 years ago
- Official implementation of Hierarchical Spectrogram Transformers (HST)☆20Oct 10, 2022Updated 3 years ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform☆18May 17, 2024Updated last year
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" [IEEE MLSP 2025] …☆39Jul 31, 2024Updated last year
- Code of our ISMIR 2025 paper - D. Afchar, G. Meseguer Brocal, K. Akesbi, R. Hennequin☆36Nov 12, 2025Updated 5 months ago
- ARCH: Audio Representations benCHmark☆55Aug 26, 2024Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations"☆42Aug 14, 2025Updated 8 months ago
- Keras implementation of "Chord Generation from Symbolic Melody Using BLSTM Networks"☆13Aug 8, 2021Updated 4 years ago
- Streaming Audiotransformers for online Audio tagging☆53Jun 14, 2024Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 11 months ago
- ☆11May 30, 2023Updated 2 years ago
- ☆22Apr 4, 2023Updated 3 years ago
- Code repository for ‘Adaptive Differential Denoising for Respiratory Sounds Classification’☆22Dec 19, 2025Updated 3 months ago
- The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"☆485Sep 18, 2025Updated 6 months ago
- Singing voice conversion based on FreeVC☆21Dec 16, 2022Updated 3 years ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- Multiband version of HIFIGAN☆19Nov 6, 2020Updated 5 years ago
- ☆25Jan 24, 2023Updated 3 years ago
- Code for our paper "Acoustic Features Fusion using Attentive Multi-channel Deep Architecture" in Keras and tensorflow☆26Nov 23, 2018Updated 7 years ago
- Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations☆100Feb 20, 2026Updated last month
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆41Aug 29, 2024Updated last year
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆93Jun 9, 2022Updated 3 years ago
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆84Nov 7, 2025Updated 5 months ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆13Mar 11, 2025Updated last year
- Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included…☆40Sep 5, 2022Updated 3 years ago