fschmid56/PretrainedSED

☆145

Alternatives and similar repositories for PretrainedSED

Users who are interested in PretrainedSED are comparing it to the libraries listed below.

b-sigpro / sed-hsmm
Onset-and-Offset-Aware Sound Event Detection
☆21 · Feb 10, 2025 · Updated last year
theMoro / EfficientSED
☆22 · Jun 12, 2025 · Updated last year
CPJKU / cpjku_dcase24
☆29 · Oct 17, 2024 · Updated last year
Audio-WestlakeU / ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
☆174 · Jun 8, 2026 · Updated last month
cai525 / Transformer4SED
This repository aims to collect Transformer-based sound event detection (SED) algorithms.
☆104 · Feb 10, 2026 · Updated 5 months ago
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 · May 1, 2025 · Updated last year
flamed-tts / Flamed-TTS
This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …
☆57 · Aug 9, 2025 · Updated 11 months ago
merlresearch / sebbs
Prediction of sound event bounding boxes (SEBBs)
☆35 · Aug 2, 2024 · Updated last year
JHU-LCAP / FlexSED
Open-vocabulary sound event detection
☆53 · Dec 17, 2025 · Updated 7 months ago
Andong-Li-speech / BridgeVoC
This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".
☆67 · Nov 5, 2025 · Updated 8 months ago
lsfhuihuiff / SongEcho_ICLR2026
Official code for SongEcho
☆64 · Mar 3, 2026 · Updated 4 months ago
KdaiP / DC-Speech-VAE
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57 · Nov 19, 2025 · Updated 8 months ago
fluxions-ai / stftvae
Inference for the STFT-VAE continuous audio codec (24kHz, 3.125Hz latent)
☆43 · Jul 12, 2026 · Updated last week
lysanderism / TimeAudio
The official repository of TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module…
☆30 · Nov 18, 2025 · Updated 8 months ago
RicherMans / CED
Source code for Consistent ensemble distillation for audio tagging
☆75 · Mar 20, 2026 · Updated 4 months ago
fgnt / sed_scores_eval
☆41 · Feb 18, 2026 · Updated 5 months ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 · Apr 10, 2025 · Updated last year
xiquan-li / FineLAP
[ACL 2026 Main] FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pre-training
☆36 · Apr 20, 2026 · Updated 3 months ago
bfs18 / armel
poorman's ar-dit tts
☆45 · Dec 31, 2025 · Updated 6 months ago
ryota-komatsu / speech_resynth
Speech Resynthesis and Language Modeling
☆27 · Jun 11, 2025 · Updated last year
astradzhao / music-rfm
Open-source code for our paper, Steering Autoregressive Music Generation with Recursive Feature Machines (Zhao et al., 2025). aka MusicRF…
☆40 · Oct 26, 2025 · Updated 8 months ago
inverse-ai / FINALLY-Speech-Enhancement
FINALLY: Fast and universal speech enhancement model delivering studio-quality audio for a wide range of recordings.
☆28 · Apr 1, 2026 · Updated 3 months ago
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 · Nov 7, 2025 · Updated 8 months ago
Audio-WestlakeU / audiossl
A library built for easier audio self-supervised training and downstream task evaluation
☆140 · Sep 25, 2025 · Updated 10 months ago
frednam93 / MDFD-SED
☆21 · Mar 6, 2025 · Updated last year
kandinskylab / kvae-audio
KVAE-Audio: a continuous full-band audio waveform autoencoder
☆101 · Updated this week
RicherMans / Dasheng
Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"
☆86 · Nov 7, 2025 · Updated 8 months ago
apple-yinhan / TQ-SED
☆24 · Mar 19, 2025 · Updated last year
Soul-AILab / SAC
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆108 · Nov 1, 2025 · Updated 8 months ago
Ereboas / MagiCodec
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆125 · Jun 4, 2025 · Updated last year
JishengBai / AudioSetCaps
A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
☆208 · Dec 13, 2024 · Updated last year
Tencent / SongBench
☆51 · Apr 30, 2026 · Updated 2 months ago
nttcslab / dcase2025_task4_baseline
☆18 · Apr 16, 2026 · Updated 3 months ago
Shy-98 / MELLE
Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"
☆41 · Jun 28, 2025 · Updated last year
XXH333 / WordVoice-main
The inference and training code for WordVoice.
☆61 · Jul 17, 2026 · Updated last week
fschmid56 / EfficientAT
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …
☆353 · Nov 20, 2024 · Updated last year
primepake / learnable-speech
This repo provides text-to-speech with a learnable audio encoder, without alignment to a transcript reference
☆54 · Sep 20, 2025 · Updated 10 months ago
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
zeyuxie29 / AudioTime
☆39 · Jul 4, 2024 · Updated 2 years ago