Implementation of SoundStream, an end-to-end neural audio codec
☆32 · Jun 11, 2023 · Updated 2 years ago
Alternatives and similar repositories for SoundStream
Users interested in SoundStream are comparing it to the libraries listed below.
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec" ☆13 · Jan 27, 2025 · Updated last year
- ☆15 · May 11, 2025 · Updated 9 months ago
- Unofficial PyTorch implementation of SoundStream with training code and a 16 kHz pretrained checkpoint ☆78 · Feb 9, 2026 · Updated 3 weeks ago
- Huggingface Implementation of AV-HuBERT on the MuAViC Dataset ☆18 · Mar 6, 2025 · Updated 11 months ago
- For audio visualization and playback in Jupyter notebooks. ☆17 · Nov 25, 2025 · Updated 3 months ago
- This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf ☆420 · Apr 21, 2022 · Updated 3 years ago
- Compute WER and SER for speech recognition evaluation ☆26 · Dec 15, 2025 · Updated 2 months ago
- Rust crate for some audio utilities ☆27 · Mar 8, 2025 · Updated 11 months ago
- ☆32 · Feb 4, 2026 · Updated last month
- A small Google Scholar self-lookup script that displays the current citation count each time the shell is opened. ☆22 · Mar 3, 2025 · Updated last year
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr… ☆23 · Mar 14, 2024 · Updated last year
- Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis" ☆108 · Dec 20, 2025 · Updated 2 months ago
- [ICASSPW] A Vector Quantized Masked AutoEncoder for speech emotion recognition ☆29 · Mar 4, 2024 · Updated 2 years ago
- ☆31 · Mar 3, 2023 · Updated 3 years ago
- JAX Implementations of Descript Audio Codec and EnCodec ☆33 · Mar 30, 2025 · Updated 11 months ago
- faster inference ☆28 · Jan 20, 2025 · Updated last year
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?" ☆34 · Feb 21, 2025 · Updated last year
- Unofficial Pytorch Lightning Implementation of "Real-time Speech Frequency Bandwidth Extension" ☆41 · Oct 20, 2025 · Updated 4 months ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆38 · Feb 23, 2023 · Updated 3 years ago
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space. ☆246 · Mar 7, 2025 · Updated 11 months ago
- An R package for analyzing linguistic alignment between partners in conversation transcripts ☆14 · Jan 30, 2026 · Updated last month
- TASU: A New Style of Alignment of Speech LLM with only Text Training Data, zero-shot on ASR and Other SU tasks ☆22 · Jan 19, 2026 · Updated last month
- A PyTorch implementation of "Continuous Relaxation Training of Discrete Latent Variable Image Models" ☆75 · Mar 25, 2020 · Updated 5 years ago
- The MATLAB code of the local mean decomposition using empirical optimal envelope ☆13 · Jan 6, 2022 · Updated 4 years ago
- Anki add-on that adds Pinyin and Zhuyin readings above Chinese characters in any field. ☆12 · Sep 23, 2025 · Updated 5 months ago
- Unofficial implementation of "High Fidelity Neural Audio Compression" ☆176 · Aug 15, 2024 · Updated last year
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation” ☆15 · Jan 3, 2025 · Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching ☆45 · Feb 9, 2025 · Updated last year
- ☆43 · Feb 8, 2025 · Updated last year
- The sparse Bayesian learning sandbox ☆11 · Jul 4, 2021 · Updated 4 years ago
- A Deep Convolutional Neural Network (DCNN) designed for the task of localizing human speech to 168 location classes using binaural microp… ☆10 · Dec 16, 2017 · Updated 8 years ago
- ☆12 · Dec 30, 2020 · Updated 5 years ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers" ☆14 · Feb 24, 2025 · Updated last year
- ☆12 · Mar 23, 2020 · Updated 5 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 · Mar 11, 2025 · Updated 11 months ago
- Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge ☆10 · Aug 8, 2023 · Updated 2 years ago
- Training a BERT model from scratch. ☆11 · Oct 15, 2023 · Updated 2 years ago
- ☆14 · Apr 23, 2025 · Updated 10 months ago
- A Pytorch Lightning WGAN-gp to generate faces ☆11 · Jan 26, 2021 · Updated 5 years ago