ekazakos / auditory-slow-fast
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
☆73Sep 27, 2021Updated 4 years ago
Alternatives and similar repositories for auditory-slow-fast
Users who are interested in auditory-slow-fast are comparing it to the libraries listed below.
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023☆57Mar 3, 2023Updated 2 years ago
- A library built for easier audio self-supervised training, downstream tasks evaluation☆136Sep 25, 2025Updated 4 months ago
- Splits for epic-sounds dataset☆85Aug 2, 2025Updated 6 months ago
- Urban Sound Classification : striving towards a fair comparison☆17Dec 11, 2020Updated 5 years ago
- VGGSound: A Large-scale Audio-Visual Dataset☆350Sep 13, 2021Updated 4 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation☆41Dec 23, 2023Updated 2 years ago
- ☆18May 6, 2024Updated last year
- Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch☆19Dec 16, 2021Updated 4 years ago
- Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".☆149Jul 13, 2023Updated 2 years ago
- WaveNet auto-encoders for ZeroSpeech challenge 2020☆37Apr 7, 2022Updated 3 years ago
- Learning differentiable temporal resolution on time-series data.☆36Nov 12, 2022Updated 3 years ago
- Code for the EMNLP 2021 Oral paper "Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search" https://arx…☆12Feb 6, 2023Updated 3 years ago
- Details of the datasets for Few-shot class-incremental audio classification☆11Dec 6, 2023Updated 2 years ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization☆26Jan 6, 2024Updated 2 years ago
- ☆34Sep 29, 2024Updated last year
- Robust Neural Audio Watermarking with Invertible Dual-Embedding☆30Nov 11, 2024Updated last year
- Localizing Visual Sounds the Hard Way☆82Jul 6, 2022Updated 3 years ago
- Siamese network for unsupervised speech representation learning☆11Oct 12, 2018Updated 7 years ago
- Evaluate EfficientAT models on the Holistic Evaluation of Audio Representations Benchmark.☆32Jun 23, 2023Updated 2 years ago
- ☆33Mar 22, 2022Updated 3 years ago
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆90Jun 9, 2022Updated 3 years ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)☆54Jan 29, 2024Updated 2 years ago
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year
- PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.☆13Feb 6, 2021Updated 5 years ago
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022)☆17Dec 20, 2022Updated 3 years ago
- PyTorch Implementation of [WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification](https://arxiv.or…☆16Jul 31, 2025Updated 6 months ago
- Temporal Compact Bilinear Pooling (TCBP)