☆125May 13, 2025Updated 11 months ago
Alternatives and similar repositories for PretrainedSED
Users that are interested in PretrainedSED are comparing it to the libraries listed below.
- ☆18Jun 12, 2025Updated 10 months ago
- ☆28Oct 17, 2024Updated last year
- Prediction of sound event bounding boxes (SEBBs)☆32Aug 2, 2024Updated last year
- This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".☆162Aug 24, 2025Updated 7 months ago
- Source code for Consistent ensemble distillation for audio tagging☆63Mar 20, 2026Updated 3 weeks ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆22Feb 10, 2025Updated last year
- Speech Resynthesis and Language Modeling☆27Jun 11, 2025Updated 10 months ago
- ☆41Feb 18, 2026Updated last month
- Variable Bitrate Residual Vector Quantization for Audio Coding☆50May 1, 2025Updated 11 months ago
- ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation☆29Mar 10, 2024Updated 2 years ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 4 months ago
- The program ranked first in Audio-only track of DCASE2024 Challenge task3.☆22Mar 2, 2026Updated last month
- ☆38Jul 4, 2024Updated last year
- [ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows☆131Sep 2, 2025Updated 7 months ago
- A library built for easier audio self-supervised training, downstream tasks evaluation☆136Sep 25, 2025Updated 6 months ago
- This repository aims to collect Transformer-based sound event detection (SED) algorithms.☆96Feb 10, 2026Updated 2 months ago
- Masked Modeling Duo: Towards a Universal Audio Pre-training Framework☆146Feb 23, 2026Updated last month
- Official code for SongEcho☆55Mar 3, 2026Updated last month
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …☆339Nov 20, 2024Updated last year
- Efficient Training of Audio Transformers with Patchout☆374Jan 12, 2024Updated 2 years ago
- ☆24Jul 30, 2025Updated 8 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…☆65Dec 26, 2025Updated 3 months ago
- ☆11Dec 28, 2023Updated 2 years ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS☆130Jun 11, 2024Updated last year
- ☆44Jan 13, 2025Updated last year
- EVAR ~ Evaluation package for Audio Representations☆75Feb 19, 2026Updated last month
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆200Dec 13, 2024Updated last year
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆84Nov 7, 2025Updated 5 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆90Dec 20, 2024Updated last year
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
- ☆33Dec 23, 2025Updated 3 months ago
- Repo associated to the DESED dataset, download and creation of data☆150Jul 16, 2024Updated last year
- Extract phoneme-level timestamps from speech audio.☆125Apr 2, 2026Updated last week
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection☆18Nov 19, 2024Updated last year
- ☆13Jan 3, 2024Updated 2 years ago
- Text-To-Speech for NotebookLM☆39Jul 20, 2025Updated 8 months ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year