Adapting a ConvNeXt model to audio classification on AudioSet
☆27Feb 19, 2025Updated last year
Alternatives and similar repositories for audioset-convnext-inf
Users that are interested in audioset-convnext-inf are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding☆23Dec 17, 2025Updated 5 months ago
- ☆11May 30, 2023Updated 3 years ago
- experiments about AudioSet☆43Jul 22, 2023Updated 2 years ago
- ☆10Jun 11, 2024Updated last year
- Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal F…☆16Aug 9, 2021Updated 4 years ago
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Accompanying code for the paper Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection.☆11Jun 7, 2022Updated 3 years ago
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 9 months ago
- ☆15May 18, 2024Updated 2 years ago
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …☆345Nov 20, 2024Updated last year
- Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"☆31Dec 6, 2023Updated 2 years ago
- Bilingual Singing Voice Synthesis☆18Mar 25, 2024Updated 2 years ago
- This sample includes simeple CNN classifier for music and audio-folder dataloader just like ImageFolder in torchvision.☆11Oct 30, 2018Updated 7 years ago
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago
- Audio Captioning datasets for PyTorch.☆128Mar 25, 2026Updated 2 months ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- Efficient Training of Audio Transformers with Patchout☆382Jan 12, 2024Updated 2 years ago
- Submission for task 2 "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques"…☆16Sep 19, 2022Updated 3 years ago
- ☆18May 28, 2025Updated last year
- ☆51Mar 5, 2026Updated 2 months ago
- ☆16May 26, 2022Updated 4 years ago
- ☆18Apr 16, 2026Updated last month
- ☆12Mar 11, 2025Updated last year
- Swift wrapper around Pocketsphinx☆15Jan 4, 2019Updated 7 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Feb 28, 2026Updated 3 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Robust Speech Recognition via Large-Scale Weak Supervision☆19Dec 1, 2022Updated 3 years ago
- A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.☆115Jun 4, 2025Updated 11 months ago
- A unified model for zero-shot singing voice conversion and synthesis☆22Nov 30, 2022Updated 3 years ago
- ☆14Oct 7, 2021Updated 4 years ago
- Forced alignment decoder for Whisper.☆16Mar 13, 2024Updated 2 years ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation☆15Aug 1, 2024Updated last year
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In…☆45Mar 25, 2024Updated 2 years ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 9 months ago
- Learning differentiable temporal resolution on time-series data.☆37Nov 12, 2022Updated 3 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Transformer Model to detect deepfakes from popular datasets. Predictions made on embeddings (features) generated by a different ViT model…☆14Nov 27, 2023Updated 2 years ago
- Monocular MSCKF_VIO 单目版本msckf_vio,包括算法模型详细数学推导文档msckf1.0.doc☆13Jun 7, 2019Updated 6 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated last year
- 6 DoF Directional Room Impulse Response (RIR) with Dense Loudspeaker Grid☆17Aug 31, 2023Updated 2 years ago
- ☆18Jul 22, 2024Updated last year
- Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".☆150Jul 13, 2023Updated 2 years ago
- Code for the paper "Self-Supervised Learning for Anomalous Sound Detection"☆40May 13, 2024Updated 2 years ago