ErikEkstedt/conv_ssl

☆14

Alternatives and similar repositories for conv_ssl

Users that are interested in conv_ssl are comparing it to the libraries listed below.

ErikEkstedt / datasets_turntaking
Datasets for turn-taking research
☆20 • Dec 21, 2023 • Updated 2 years ago
ErikEkstedt / vap_turn_taking
vad
☆26 • Apr 3, 2023 • Updated 3 years ago
ErikEkstedt / VoiceActivityProjection
Voice Activity Projection Models: Self-supervised learning of Turn-taking Events
☆106 • May 29, 2024 • Updated 2 years ago
ahclab / turntaking
☆13 • Feb 16, 2024 • Updated 2 years ago
inokoj / VAP-Realtime
A real-time implementation of Voice Activity Projection (VAP) is aimed at controlling behaviors of spoken dialogue systems, such as turn-…
☆103 • Jul 24, 2025 • Updated last year
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 • Feb 13, 2021 • Updated 5 years ago
BirgerMoell / tmh
☆16 • Oct 7, 2022 • Updated 3 years ago
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 • Mar 30, 2021 • Updated 5 years ago
skit-ai / slu-prosody
Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…
☆27 • May 17, 2023 • Updated 3 years ago
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
☆11 • Nov 5, 2021 • Updated 4 years ago
awasthiabhijeet / Error-Driven-ASR-Personalization
Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021
☆11 • Jun 13, 2021 • Updated 5 years ago
ErikEkstedt / TurnGPT
TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog
☆71 • May 18, 2024 • Updated 2 years ago
ZackHodari / average_prosody
Code for paper titled "Using generative modelling to produce varied intonation for speech synthesis" submitted to the Speech Synthesis Wo…
☆24 • Dec 8, 2019 • Updated 6 years ago
line / WaveTrainerFit
Official implementation of "Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech G…
☆16 • Feb 6, 2026 • Updated 5 months ago
Sreyan88 / Disfluency-Detection-with-Span-Classification
This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…
☆14 • Jun 6, 2023 • Updated 3 years ago
talhanai / kaldi-diar-latte
steps to perform text-based speaker diarization with kaldi toolkit
☆12 • Nov 2, 2018 • Updated 7 years ago
RanaCM / DSU-AVO
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12 • May 13, 2024 • Updated 2 years ago
vakila / de-stress
Prototype German Computer-Assisted Pronunciation Training tool for lexical stress errors
☆12 • Oct 28, 2015 • Updated 10 years ago
spyysalo / bert-pos
Part-of-speech tagging using BERT
☆10 • Nov 14, 2019 • Updated 6 years ago
edoost / pert
Persian Ezafe Recognition Using Transformers and Its Role in Part-Of-Speech Tagging
☆10 • Nov 15, 2021 • Updated 4 years ago
pjyazdian / Gesture2Vec
This is an official PyTorch implementation of "Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gestu…
☆27 • Feb 9, 2024 • Updated 2 years ago
thevoicecompany / gazelle-train
Joint speech-language model - respond directly to audio!
☆30 • May 13, 2024 • Updated 2 years ago
zerospeech / zerospeech2021
Zerospeech Challenge 2021: validation and evaluation software
☆12 • Jun 13, 2022 • Updated 4 years ago
sinhat98 / adapter-wavlm
☆46 • Feb 16, 2023 • Updated 3 years ago
LouChao98 / nner_as_parsing
☆16 • Mar 22, 2023 • Updated 3 years ago
OpenVoiceOS / ovos-stt-plugin-vosk
vosk STT plugin for mycroft
☆15 • Jun 15, 2026 • Updated last month
tiro-is / tiro-speech-core
This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core
☆15 • Jun 19, 2023 • Updated 3 years ago
bootphon / shennong
A Python toolbox for speech features extraction
☆166 • Feb 8, 2023 • Updated 3 years ago
gaurangbharti1 / wealth-alpaca
Training Script and Dataset for Wealth Alpaca-LoRa
☆16 • Apr 7, 2023 • Updated 3 years ago
zcxu-eric / AVA-AVD
☆51 • Nov 24, 2022 • Updated 3 years ago
tuanct1997 / Federated-Learning-ASR-based-on-wav2vec-2.0
☆18 • Mar 13, 2024 • Updated 2 years ago
aholab / AhoTTS
Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its…
☆18 • Jan 15, 2026 • Updated 6 months ago
Audio-WestlakeU / UMA-ASR
This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).
☆35 • Dec 17, 2024 • Updated last year
ms-dot-k / TMT
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
☆18 • May 23, 2024 • Updated 2 years ago
Open-Speech-EkStep / crowdsource-dataplatform
This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…
☆17 • Mar 6, 2023 • Updated 3 years ago
ccoreilly / deepspeech-catala
Deepspeech ASR Model for the Catalan Language
☆17 • Feb 15, 2021 • Updated 5 years ago
MichaelMoroz / ShaderToy2CPP
a close enough approximation of the shadertoy framework
☆12 • Jul 2, 2020 • Updated 6 years ago
Lhx94As / E2E-language-diarization
Source code of paper <End-to-End Language Diarization for Bilingual Code-switching Speech>
☆19 • Jan 23, 2022 • Updated 4 years ago
Open-Speech-EkStep / data-acquisition-pipeline
☆18 • Apr 28, 2021 • Updated 5 years ago