google-deepmind / slowfast_nfnets
☆30, updated 2 years ago
Alternatives and similar repositories for slowfast_nfnets:
Users who are interested in slowfast_nfnets compare it to the libraries listed below.
- ARCH: Audio Representations benCHmark (☆40, updated 5 months ago)
- PyTorch Dataset for Speech and Music audio (☆73, updated 7 months ago)
- Training code and trained checkpoints for ASGAN. (☆62, updated last year)
- A collection of audio autoencoders, in PyTorch. (☆39, updated last year)
- Source code for training models and using the hyperbolic interface proposed in our ICASSP 2023 paper, “Hyperbolic Audio Source Separation… (☆62, updated last year)
- A library built for easier audio self-supervised training, downstream tasks evaluation (☆111, updated 5 months ago)
- Official implementation of "Learning Music Audio Representations Via Weak Language Supervision" (ICASSP 2022) (☆46, updated 2 months ago)
- Masked Modeling Duo: Towards a Universal Audio Pre-training Framework (☆84, updated 6 months ago)
- A list of papers about audio captioning (☆78, updated 2 years ago)
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… (☆36, updated last year)
- EVAR ~ Evaluation package for Audio Representations (☆46, updated 3 months ago)
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' (☆93, updated 6 months ago)
- A list of resources that can help in research for automated audio captioning (☆34, updated 3 years ago)
- Learning differentiable temporal resolution on time-series data. (☆35, updated 2 years ago)
- Improving Recording Device Generalization using Impulse Response Augmentation (☆12, updated last year)
- Simple baseline model for the HEAR benchmark (☆23, updated last month)
- SA-toolkit: Speaker speech anonymization toolkit in python (☆23, updated 3 weeks ago)
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… (☆49, updated last year)
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation (☆74, updated 3 years ago)
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks". (☆13, updated 2 years ago)
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) (☆39, updated last year)
- Inference code for PaSST, using the HEAR API. (☆31, updated last year)
- Transcribing Speech with Multinomial Diffusion, training code and models. (☆76, updated last year)
- audioLIME: Listenable Explanations Using Source Separation (☆34, updated 3 years ago)