liamdugan/speech-to-speech

Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"

☆32

Alternatives and similar repositories for speech-to-speech

Users who are interested in speech-to-speech are comparing it to the repositories listed below.

ictnlp / DST
DST is a Decoder-only simultaneous machine translation model, which can conduct policy decision and translation concurrently
☆11 · Jun 6, 2024 · Updated 2 years ago
liamdugan / human-detection
Code for the AAAI 2023 Paper "Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Gene…
☆16 · Oct 29, 2024 · Updated last year
jwieting / bilingual-generative-transformer
Code for "A Bilingual Generative Transformer for Semantic Sentence Embedding" published at EMNLP 2020.
☆10 · Nov 20, 2020 · Updated 5 years ago
ictnlp / FastLongSpeech
FastLongSpeech is a novel framework designed to extend the capabilities of Large Speech-Language Models for efficient long-speech process…
☆16 · Jul 22, 2025 · Updated last year
ictnlp / ITST
Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"
☆13 · Nov 3, 2022 · Updated 3 years ago
OSU-STARLAB / Simul-LLM
[ACL 2024] An easily extensible framework for simultaneous, text-to-text neural machine translation (SimulMT) for LLMs.
☆18 · Apr 21, 2025 · Updated last year
ictnlp / ComSpeech
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
☆27 · Jul 2, 2024 · Updated 2 years ago
ictnlp / Dual-Path
Code for ACL 2022 main conference paper "Modeling Dual Read/Write Paths for Simultaneous Machine Translation"
☆12 · Mar 31, 2022 · Updated 4 years ago
skit-ai / N-Best-ASR-Transformer
Code for ACL-IJCNLP 2021 paper "N-Best-ASR-Transformer: Enhancing SLU Performance using Multiple ASR Hypotheses."
☆17 · Nov 30, 2021 · Updated 4 years ago
danliu2 / caat
☆35 · Sep 1, 2022 · Updated 3 years ago
Jason-Young-AI / YoungToolkit
A Toolkit for a series of Young projects.
☆23 · Apr 30, 2021 · Updated 5 years ago
pshirali / workbench
A hierarchical environment manager for bash, written in bash.
☆17 · Apr 18, 2026 · Updated 3 months ago
ictnlp / DiSeg
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
☆37 · Dec 6, 2023 · Updated 2 years ago
ictnlp / LSG
The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”
☆15 · Jan 3, 2025 · Updated last year
titu1994 / warprnnt_numba
WarpRNNT loss ported in Numba CPU/CUDA for PyTorch
☆17 · Mar 11, 2022 · Updated 4 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
rtx-on / rtx-explore
DirectX Raytracing Path Tracer
☆58 · Jun 23, 2022 · Updated 4 years ago
ictnlp / GMA
Code for ACL 2022 findings paper "Gaussian Multi-head Attention for Simultaneous Machine Translation"
☆11 · Mar 31, 2022 · Updated 4 years ago
MingjieChen / EasyVC
A toolkit for any-to-any encoder-decoder voice conversion systems
☆83 · Aug 10, 2023 · Updated 2 years ago
zerospeech / zerospeech2021_baseline
BERT and LSTM baseline models of the ZeroSpeech Challenge 2021
☆60 · Oct 19, 2022 · Updated 3 years ago
ictnlp / SLED-TTS
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
☆108 · May 20, 2025 · Updated last year
ictnlp / NAST-S2x
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
☆78 · Oct 22, 2024 · Updated last year
vipul-sharma20 / nvim-jira
A Neovim Jira plugin
☆11 · Apr 20, 2023 · Updated 3 years ago
ltbringer / swiggy-order
Order food via terminal.
☆15 · Dec 29, 2020 · Updated 5 years ago
ictnlp / FA-DAT
Official Implementation for the ICLR 2023 paper "Fuzzy Alignments in Directed Acyclic Graph for Non-autoregressive Machine Translation"
☆14 · Mar 1, 2023 · Updated 3 years ago
Maknee / PennOS-hardware-wrapper
Provides a wrapper to boot your penn-os on hardware!
☆10 · Dec 7, 2017 · Updated 8 years ago
wavlab-speech / cmu_multilingual_speech
CMU multilingual speech repository
☆30 · Apr 15, 2022 · Updated 4 years ago
denisvieriu / Portable-Executable-Minifilter-Driver
☆14 · Mar 28, 2018 · Updated 8 years ago
martin-hughes / project_azalea
A future hobby OS kernel
☆11 · Nov 8, 2020 · Updated 5 years ago
b-flo / warp-transducer
A fast parallel implementation of RNN Transducer.
☆12 · Apr 8, 2025 · Updated last year
ictnlp / PCFG-NAT
Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".
☆12 · Jan 4, 2024 · Updated 2 years ago
mit-ccc / acl-nuse-personal-narratives
Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types -- Supplementary inf…
☆12 · Jul 14, 2020 · Updated 6 years ago
StonyBrookNLP / tellmewhy
Website for release of TellMeWhy dataset for why question answering
☆14 · Nov 11, 2022 · Updated 3 years ago
xl8-ai / WordSiMT
Official implementation of EMNLP 2023 Findings paper "Enhanced Simultaneous Machine Translation with Word-level Policies"
☆18 · Apr 10, 2026 · Updated 3 months ago
simplc / WCN-BERT
Jointly encoding word confusion networks (WCNs) and dialogue context with BERT for spoken language understanding (SLU).
☆12 · Jun 12, 2023 · Updated 3 years ago
RaphaelOlivier / whisper_attack
☆23 · Apr 3, 2025 · Updated last year
ictnlp / DASpeech
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
☆63 · Jul 22, 2024 · Updated 2 years ago
HKUST-KnowComp / AbsPyramid
Official code repository for the paper: AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Grap…
☆13 · Oct 30, 2024 · Updated last year
albertfgu / diffwave-sashimi
Implementation of DiffWave and SaShiMi audio generation models
☆128 · Apr 4, 2023 · Updated 3 years ago