☆118May 13, 2025Updated 10 months ago
Alternatives and similar repositories for PretrainedSED
Users that are interested in PretrainedSED are comparing it to the libraries listed below.
- ☆17Jun 12, 2025Updated 9 months ago
- ☆28Oct 17, 2024Updated last year
- This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".☆161Aug 24, 2025Updated 7 months ago
- Prediction of sound event bounding boxes (SEBBs)☆32Aug 2, 2024Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 11 months ago
- Source code for Consistent ensemble distillation for audio tagging☆60Updated this week
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- Speech Resynthesis and Language Modeling☆27Jun 11, 2025Updated 9 months ago
- ☆40Feb 18, 2026Updated last month
- Variable Bitrate Residual Vector Quantization for Audio Coding☆50May 1, 2025Updated 10 months ago
- ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation☆29Mar 10, 2024Updated 2 years ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 4 months ago
- The program ranked first in Audio-only track of DCASE2024 Challenge task3.☆20Mar 2, 2026Updated 3 weeks ago
- ☆38Jul 4, 2024Updated last year
- A library built for easier audio self-supervised training, downstream tasks evaluation☆136Sep 25, 2025Updated 6 months ago
- This repository aims to collect Transformer-based sound event detection (SED) algorithms.☆94Feb 10, 2026Updated last month
- ☆23Jul 30, 2025Updated 7 months ago
- Official code for SongEcho☆52Mar 3, 2026Updated 3 weeks ago
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …☆335Nov 20, 2024Updated last year
- Efficient Training of Audio Transformers with Patchout☆371Jan 12, 2024Updated 2 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…☆63Dec 26, 2025Updated 3 months ago
- ☆11Dec 28, 2023Updated 2 years ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS☆130Jun 11, 2024Updated last year
- Masked Modeling Duo: Towards a Universal Audio Pre-training Framework☆140Feb 23, 2026Updated last month
- EVAR ~ Evaluation package for Audio Representations☆75Feb 19, 2026Updated last month
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆198Dec 13, 2024Updated last year
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆83Nov 7, 2025Updated 4 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆90Dec 20, 2024Updated last year
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
- ☆33Dec 23, 2025Updated 3 months ago
- Repo associated to the DESED dataset, download and creation of data☆146Jul 16, 2024Updated last year
- Extract phoneme-level timestamps from speech audio.☆121Feb 28, 2026Updated 3 weeks ago
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection☆17Nov 19, 2024Updated last year
- ☆20Mar 6, 2025Updated last year
- ☆13Jan 3, 2024Updated 2 years ago
- Text-To-Speech for NotebookLM☆39Jul 20, 2025Updated 8 months ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- Polyphonic Sound Detection Score (PSDS)☆16Jan 20, 2020Updated 6 years ago