rainavyas / prepend_acoustic_attack
Prepend universal audio attack segment to mute Whisper
☆12 · Updated this week
Related projects
Alternatives and complementary repositories for prepend_acoustic_attack
- EMO-SUPERB submission ☆28 · Updated 2 months ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words ☆42 · Updated 4 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆77 · Updated 3 months ago
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆90 · Updated last month
- BLSP-Emo: Towards Empathetic Large Speech-Language Models ☆36 · Updated 5 months ago
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension ☆50 · Updated 2 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆81 · Updated 2 weeks ago
- Continual Learning Method RAWM for ICML 2023 ☆20 · Updated last month
- A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024 ☆13 · Updated 3 months ago
- The open source code for LLM-Codec ☆114 · Updated 2 months ago
- ☆14 · Updated last month
- ☆34 · Updated 6 months ago
- ☆21 · Updated last month
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation ☆21 · Updated 2 months ago
- ☆21 · Updated last week
- ☆35 · Updated 2 weeks ago
- [ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆47 · Updated 2 weeks ago
- ☆28 · Updated this week
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆33 · Updated last week
- ☆21 · Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆62 · Updated last week
- ☆10 · Updated 7 months ago
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation ☆79 · Updated 3 weeks ago
- This repository collects papers related to Speech Tokenizer. ☆13 · Updated 3 weeks ago
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup. ☆60 · Updated 2 weeks ago
- ☆12 · Updated 3 months ago
- Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆102 · Updated last month
- This repository presents a subset of our proposed FSD dataset for song deepfake detection. ☆19 · Updated last month
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆48 · Updated 2 weeks ago
- Code for Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction (ACL24) ☆28 · Updated 3 months ago