llm-lab-org/CLASP

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/llm-lab-org/CLASP)

llm-lab-org / CLASP

CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval

☆13

Alternatives and similar repositories for CLASP

Users that are interested in CLASP are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

reppy4620 / convnext_tts
View on GitHub
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18Oct 20, 2024Updated last year
ArenAcikgoz / Whisper-Alignment
View on GitHub
Forced alignment decoder for Whisper.
☆16Mar 13, 2024Updated 2 years ago
pkufool / simple-wer
View on GitHub
A simple command line tool to calculate WER for ASR.
☆14Oct 14, 2024Updated last year
JacobLinCool / MPSENet
View on GitHub
Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.
☆22Nov 1, 2024Updated last year
ex3ndr / supervoice-enhance
View on GitHub
Supervoice diffusion enhance
☆28Jul 15, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
huangruizhe / audio
View on GitHub
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10Sep 30, 2024Updated last year
dreamtheater123 / VoxEval
View on GitHub
Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models
☆24Jun 16, 2025Updated last year
gzhu06 / Cacophony
View on GitHub
Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
☆49Jan 19, 2026Updated 6 months ago
fgnt / speaker_reassignment
View on GitHub
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14Feb 5, 2025Updated last year
MontrealCorpusTools / kalpy
View on GitHub
Pybind11 bindings for Kaldi
☆15Jul 11, 2026Updated last week
jhuang448 / MultilingualALT
View on GitHub
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""
☆15Jun 28, 2024Updated 2 years ago
shinhyeokoh / rwen
View on GitHub
☆14Jun 16, 2023Updated 3 years ago
pilot7747 / VoxDIY
View on GitHub
This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper.
☆16Jul 22, 2021Updated 4 years ago
speechcatcher-asr / speechcatcher-data
View on GitHub
☆11Sep 5, 2025Updated 10 months ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
OlaWod / PitchVC
View on GitHub
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
☆35Jun 6, 2024Updated 2 years ago
LilDevsy0117 / Ultra-Sortformer
View on GitHub
Ultra-Sortformer for Scalable Speaker Diarization
☆27Apr 9, 2026Updated 3 months ago
ina-foss / InaGVAD
View on GitHub
Voice activity detection and speaker gender segmentation audiovisual corpus
☆16Jan 20, 2025Updated last year
microsoft / Distill-MOS
View on GitHub
Distillation of Self-Supervised Representation-Based Speech Quality Assessment
☆49May 15, 2025Updated last year
FrenchKrab / datasets-pyannote
View on GitHub
Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio)
☆15Oct 22, 2025Updated 8 months ago
miccio-dk / NISQA
View on GitHub
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16Apr 13, 2022Updated 4 years ago
yzGuu830 / efficient-speech-codec
View on GitHub
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126Mar 20, 2025Updated last year
WangHelin1997 / Aty-TTS
View on GitHub
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11May 14, 2025Updated last year
kjw11 / CSEnet-ASR
View on GitHub
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12Mar 14, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
maum-ai / sane-tts
View on GitHub
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
☆11Jun 30, 2023Updated 3 years ago
Auroraaa86 / LCS-CTC
View on GitHub
For IEEE ASRU(2025)
☆15Jun 21, 2025Updated last year
freds0 / kabooks
View on GitHub
KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…
☆13Mar 24, 2023Updated 3 years ago
JaesungHuh / av-diarization
View on GitHub
Audio-visual diarization pipeline used for creating VoxConverse dataset
☆22Jun 6, 2025Updated last year
avishaiElmakies / unsupervised_speech_segmentation_using_slm
View on GitHub
☆20Jan 8, 2025Updated last year
aqtq314 / VogenSVS
View on GitHub
☆15Apr 16, 2026Updated 3 months ago
aholab / AhoTTS
View on GitHub
Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its…
☆18Jan 15, 2026Updated 6 months ago
k2-fsa / sherpa-mlx
View on GitHub
sherpa with mlx
☆15Aug 2, 2025Updated 11 months ago
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
corticph / error-align
View on GitHub
Text-to-text alignment algorithm for speech recognition error analysis.
☆31Jun 23, 2026Updated 3 weeks ago
ashi-ta / speechGLUE
View on GitHub
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13Jun 2, 2023Updated 3 years ago
Martossien / transcria
View on GitHub
Self-hosted meeting transcription portal — speech-to-text, speaker diarization, LLM-corrected transcripts, structured summaries and Word …
☆24Updated this week
v-manhlt3 / m-LTM-Audio-Text-Retrieval
View on GitHub
☆13Jan 5, 2025Updated last year
NTIA / alignnet
View on GitHub
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18Aug 1, 2025Updated 11 months ago
hhguo / SoCodec
View on GitHub
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92Dec 20, 2024Updated last year
huutuongtu / Lightvoc
View on GitHub
LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM
☆18May 17, 2024Updated 2 years ago