cwang621/blsp

BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing

☆59

Alternatives and similar repositories for blsp

Users interested in blsp are comparing it to the repositories listed below.

cwang621 / blsp-emo
BLSP-Emo: Towards Empathetic Large Speech-Language Models
☆61 · Jun 7, 2024 · Updated 2 years ago
cpdu / vallt
☆36 · Mar 14, 2025 · Updated last year
ituvisionlab / EdVAE
Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders"
☆14 · Sep 20, 2024 · Updated last year
WangHelin1997 / SpeechTasks
This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…
☆83 · Jun 7, 2024 · Updated 2 years ago
microsoft / Pengi
An Audio Language model for Audio Tasks
☆322 · Apr 19, 2024 · Updated 2 years ago
xydaytoy / EVA
☆14 · Apr 16, 2024 · Updated 2 years ago
OFA-Sys / AIR-Bench
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
☆133 · Dec 9, 2024 · Updated last year
X-LANCE / StoryTTS
[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
☆141 · Apr 27, 2024 · Updated 2 years ago
SLPcourse / Singing-Voice-Conversion
Project of Singing Voice Conversion.
☆16 · Oct 27, 2023 · Updated 2 years ago
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
walker-hyf / FCTalker
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)
☆26 · Feb 22, 2024 · Updated 2 years ago
jh-cha-prml / JELLY
Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis"
☆14 · Nov 5, 2024 · Updated last year
asappresearch / simple-tts
Contains the code associated with the ICLR submission for our text-to-speech diffusion model
☆57 · Oct 31, 2023 · Updated 2 years ago
p0p4k / Matcha-TTS-2
E2E TTS using Conditional Flow Matching (Experimental*)
☆71 · Nov 10, 2023 · Updated 2 years ago
ictnlp / CRESS
Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".
☆16 · Oct 25, 2023 · Updated 2 years ago
YuanGongND / ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
☆478 · Apr 24, 2024 · Updated 2 years ago
kuan2jiu99 / Awesome-Speech-Generation
Survey on speech generation work.
☆21 · Nov 26, 2023 · Updated 2 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
thuhcsi / SnakeGAN
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37 · Apr 25, 2023 · Updated 3 years ago
speechnovateur / languagecodec_tmp
Temporary anonymous version
☆22 · Mar 20, 2024 · Updated 2 years ago
AI-S2-Lab / FluentEditor
[InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆62 · Oct 23, 2024 · Updated last year
ZNLP / Language-Imbalance-Driven-Rewarding
[ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving
☆25 · Apr 6, 2026 · Updated 3 months ago
XXH333 / WordVoice-main
The inference and training code for WordVoice.
☆48 · Updated this week
zhenye234 / LLaSA_inference
☆43 · Feb 8, 2025 · Updated last year
redmist328 / APNet2
Source code of APNet2, a vocoder
☆60 · Nov 23, 2023 · Updated 2 years ago
ASLP-lab / Hum-Dial
ICASSP2026 HumDial Challenge
☆50 · May 28, 2026 · Updated last month
ictnlp / NAST-S2x
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
☆78 · Oct 22, 2024 · Updated last year
0nutation / SpeechGPT
SpeechGPT Series: Speech Large Language Models
☆1,402 · Jul 22, 2024 · Updated last year
0nutation / USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)
☆152 · Sep 14, 2023 · Updated 2 years ago
CASIA-LM / OpenS2S
OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model
☆119 · Mar 28, 2026 · Updated 3 months ago
kimsunwiub / BLOOM-Net
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14 · Feb 13, 2022 · Updated 4 years ago
p1an-lin-jung / wv_tts
☆19 · Mar 22, 2024 · Updated 2 years ago
yzGuu830 / efficient-speech-codec
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126 · Mar 20, 2025 · Updated last year
JishengBai / AudioSetCaps
A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
☆208 · Dec 13, 2024 · Updated last year
ictnlp / SiLLM
SiLLM is a Simultaneous Machine Translation (SiMT) Framework. It utilizes a Large Language model as the translation model and employs a t…
☆18 · Feb 22, 2024 · Updated 2 years ago
vivian556123 / NeurIPS2024-CoVoMix
Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
☆67 · Jan 16, 2025 · Updated last year
qiuqiangkong / sampleRNN_acoustic_scene_generation
☆14 · Apr 18, 2019 · Updated 7 years ago
Sonata165 / ControllableLyricTranslation
Code for the paper "Songs Across Borders: Singable and Controllable Neural Lyric Translation"
☆26 · Feb 3, 2026 · Updated 5 months ago
anton-jeran / MULTI-AUDIODEC
This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.
☆54 · Mar 17, 2025 · Updated last year