voidful/SpeechMix

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/voidful/SpeechMix)

voidful / SpeechMix

Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together

☆46

Alternatives and similar repositories for SpeechMix

Users that are interested in SpeechMix are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

voidful / asrp
View on GitHub
ASR text preprocessing utility
☆21Aug 5, 2024Updated last year
nervjack2 / Speech2Unit
View on GitHub
☆13Sep 25, 2024Updated last year
George0828Zhang / torch_cif
View on GitHub
A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab…
☆37Feb 10, 2024Updated 2 years ago
voidful / NLPrep
View on GitHub
🍳 NLPrep - dataset tool for many natural language processing task
☆28Jul 30, 2021Updated 4 years ago
voidful / TFkit
View on GitHub
🤖📇 handling multiple nlp task in one pipeline
☆57Sep 18, 2025Updated 10 months ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
Speech-Lab-IITM / CCC-wav2vec-2.0
View on GitHub
Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…
☆23Mar 18, 2024Updated 2 years ago
mct10 / CoBERT
View on GitHub
Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
☆48Nov 8, 2023Updated 2 years ago
ashi-ta / speechGLUE
View on GitHub
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13Jun 2, 2023Updated 3 years ago
Splend1d / T5lephone
View on GitHub
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19Nov 29, 2022Updated 3 years ago
nervjack2 / MelHuBERT
View on GitHub
Official implementation of MelHuBERT
☆70Feb 21, 2026Updated 5 months ago
lingjzhu / spoken_sent_embedding
View on GitHub
Unsupervised spoken sentence embeddings
☆14Dec 14, 2022Updated 3 years ago
ga642381 / Taiwanese-Speech-Synthesis
View on GitHub
Taiwanese Speech Synthesis with Tacotron2
☆26Oct 2, 2022Updated 3 years ago
MiscellaneousStuff / PhoneLM
View on GitHub
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48Sep 4, 2023Updated 2 years ago
ga642381 / Taiwanese-Translation
View on GitHub
Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus
☆13Oct 15, 2022Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
voidful / FTA
View on GitHub
Technical Analysis on Cryptocurrency
☆25Oct 14, 2025Updated 9 months ago
eastonYi / Unsupervised-ASR
View on GitHub
unsupervised ASR (mainly phone classifier) using EODM and GAN
☆12Oct 22, 2020Updated 5 years ago
mechanicalsea / lighthubert
View on GitHub
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT
☆73Sep 26, 2022Updated 3 years ago
JSALT-2022-SSL / superb-prosody
View on GitHub
☆31Jul 13, 2023Updated 3 years ago
Harry-Chan / seq2seqlm-on-qg
View on GitHub
☆13Feb 9, 2022Updated 4 years ago
maxidl / wav2vec2
View on GitHub
☆10Mar 29, 2021Updated 5 years ago
skit-ai / slu-prosody
View on GitHub
Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…
☆27May 17, 2023Updated 3 years ago
MingLunHan / CIF-ColDec
View on GitHub
[ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection
☆25Jul 14, 2026Updated last week
amazon-science / proteno
View on GitHub
This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…
☆45May 25, 2021Updated 5 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
vectominist / MiniASR
View on GitHub
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53Dec 6, 2022Updated 3 years ago
aispeech-lab / w2v-cif-bert
View on GitHub
☆37Jun 28, 2021Updated 5 years ago
voidful / ipa2
View on GitHub
Tools for convert Text to IPA in python
☆19Feb 11, 2023Updated 3 years ago
voidful / BDG
View on GitHub
Code for "A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies."
☆27Feb 2, 2022Updated 4 years ago
jasonppy / word-discovery
View on GitHub
Word Discovery in Visually Grounded, Self-Supervised Speech Models
☆27Dec 4, 2023Updated 2 years ago
archiki / Robust-E2E-ASR
View on GitHub
This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 20…
☆49Dec 25, 2024Updated last year
lstrgar / ss-phoneme-seg
View on GitHub
Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…
☆55Nov 4, 2022Updated 3 years ago
AlanBaade / SyllableLM
View on GitHub
Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models
☆63Jul 1, 2025Updated last year
slp-rl / SpokenStoryCloze
View on GitHub
A spoken version of the textual story cloze benchmark
☆22Aug 6, 2023Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
RanaCM / DSU-AVO
View on GitHub
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12May 13, 2024Updated 2 years ago
sungnyun / ARMHuBERT
View on GitHub
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41Aug 29, 2024Updated last year
zqs01 / data2vecnoisy
View on GitHub
☆11Oct 20, 2022Updated 3 years ago
voidful / nlp2go
View on GitHub
🏃 hosting nlp models in one line
☆20May 8, 2024Updated 2 years ago
p208p2002 / albert-zh-for-pytorch-transformers
View on GitHub
轉換好的 Albert 中文模型 (for pytorch-transformers)
☆19Mar 6, 2020Updated 6 years ago
voidful / MMLM
View on GitHub
Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra
☆16Dec 10, 2024Updated last year
bagustris / ssl-ser
View on GitHub
Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"
☆10Mar 15, 2023Updated 3 years ago