☆16 · Mar 25, 2025 · Updated 11 months ago
Alternatives and similar repositories for RADKA-CSS
Users who are interested in RADKA-CSS are comparing it to the libraries listed below.
- ☆11 · Mar 24, 2025 · Updated 11 months ago
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024) ☆26 · Feb 22, 2024 · Updated 2 years ago
- This is a general framework for fake audio detection using pytorch lightning ☆27 · Jul 24, 2025 · Updated 7 months ago
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆34 · Nov 23, 2023 · Updated 2 years ago
- [CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie… ☆23 · Jun 6, 2025 · Updated 8 months ago
- [ACM MM 24] GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis ☆20 · Mar 24, 2025 · Updated 11 months ago
- ☆23 · Oct 23, 2024 · Updated last year
- ☆21 · Jun 16, 2021 · Updated 4 years ago
- Synthesis speech detection based on Breathing-Talking-Silence sounds ☆21 · Sep 3, 2025 · Updated 5 months ago
- Implementation of MathReader, Text-to-Speech for Mathematical Documents ☆27 · Sep 23, 2025 · Updated 5 months ago
- This is a winter of code project aimed at speech enhancement of text to speech models. ☆24 · Feb 6, 2022 · Updated 4 years ago
- [ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning" ☆33 · May 7, 2025 · Updated 9 months ago
- ☆27 · Jan 17, 2024 · Updated 2 years ago
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆59 · Jun 20, 2024 · Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆62 · Nov 1, 2024 · Updated last year
- The official implementation of the DIFFA series for dLLM-based large audio language model ☆59 · Feb 2, 2026 · Updated last month
- This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio". ☆66 · Dec 13, 2024 · Updated last year
- ☆38 · Apr 3, 2025 · Updated 10 months ago
- Speech samples and code of BEdit-TTS ☆34 · Oct 8, 2023 · Updated 2 years ago
- code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection ☆48 · May 1, 2023 · Updated 2 years ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆96 · Nov 9, 2024 · Updated last year
- Code for the paper "RIR-in-a-Box: Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20… ☆15 · Sep 1, 2024 · Updated last year
- Whisper finetuning ☆16 · Apr 9, 2025 · Updated 10 months ago
- KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fas… ☆23 · Updated this week
- ☆11 · Aug 20, 2025 · Updated 6 months ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23) ☆10 · Oct 30, 2024 · Updated last year
- Russian phonetical transcription ☆11 · Nov 19, 2025 · Updated 3 months ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 · Apr 10, 2025 · Updated 10 months ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" … ☆10 · Nov 6, 2024 · Updated last year
- A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Vosk Speech Recognition API) and TRANSLATED SUBTITLE FILE… ☆11 · May 5, 2024 · Updated last year
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆14 · Dec 19, 2022 · Updated 3 years ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming ☆15 · Aug 20, 2024 · Updated last year
- ☆13 · Oct 9, 2025 · Updated 4 months ago
- ☆11 · Aug 11, 2023 · Updated 2 years ago
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte… ☆43 · Mar 3, 2025 · Updated 11 months ago
- Dataset/code for AudioMarkBench: Benchmarking Robustness of Audio Watermarking ☆45 · Aug 23, 2024 · Updated last year
- A list of tools, papers and code related to Fake Audio Detection. ☆224 · Updated this week
- LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement ☆46 · Mar 10, 2025 · Updated 11 months ago