Adapting a ConvNeXt model to audio classification on AudioSet
☆27Feb 19, 2025Updated last year
Alternatives and similar repositories for audioset-convnext-inf
Users that are interested in audioset-convnext-inf are comparing it to the libraries listed below.
- CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding☆22Dec 17, 2025Updated 3 months ago
- ☆12May 30, 2023Updated 2 years ago
- experiments about AudioSet☆43Jul 22, 2023Updated 2 years ago
- Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal F…☆16Aug 9, 2021Updated 4 years ago
- Accompanying code for the paper Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection.☆11Jun 7, 2022Updated 3 years ago
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 7 months ago
- ☆15May 18, 2024Updated last year
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …☆335Nov 20, 2024Updated last year
- Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"☆31Dec 6, 2023Updated 2 years ago
- Bilingual Singing Voice Synthesis☆18Mar 25, 2024Updated 2 years ago
- This sample includes a simple CNN classifier for music and an audio-folder dataloader, just like ImageFolder in torchvision.☆11Oct 30, 2018Updated 7 years ago
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago
- Audio Captioning datasets for PyTorch.☆128Updated this week
- Efficient Training of Audio Transformers with Patchout☆371Jan 12, 2024Updated 2 years ago
- Submission for task 2 "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques"…☆16Sep 19, 2022Updated 3 years ago
- ☆18May 28, 2025Updated 10 months ago
- ☆51Mar 5, 2026Updated 3 weeks ago
- ☆16May 26, 2022Updated 3 years ago
- ☆19Aug 16, 2025Updated 7 months ago
- ☆12Mar 11, 2025Updated last year
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Feb 28, 2026Updated last month
- Swift wrapper around Pocketsphinx☆15Jan 4, 2019Updated 7 years ago
- A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.☆114Jun 4, 2025Updated 9 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision☆19Dec 1, 2022Updated 3 years ago
- A unified model for zero-shot singing voice conversion and synthesis☆22Nov 30, 2022Updated 3 years ago
- ☆14Oct 7, 2021Updated 4 years ago
- Forced alignment decoder for Whisper.☆15Mar 13, 2024Updated 2 years ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation☆15Aug 1, 2024Updated last year
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In…☆45Mar 25, 2024Updated 2 years ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 7 months ago
- Learning differentiable temporal resolution on time-series data.☆37Nov 12, 2022Updated 3 years ago
- Transformer Model to detect deepfakes from popular datasets. Predictions made on embeddings (features) generated by a different ViT model…☆14Nov 27, 2023Updated 2 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 11 months ago
- 6 DoF Directional Room Impulse Response (RIR) with Dense Loudspeaker Grid☆17Aug 31, 2023Updated 2 years ago
- ☆14Feb 19, 2025Updated last year
- ☆17Jul 22, 2024Updated last year
- Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".☆150Jul 13, 2023Updated 2 years ago
- Code for the paper "Self-Supervised Learning for Anomalous Sound Detection"☆40May 13, 2024Updated last year