sanchit-gandhi/seq2seq-speech

Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.

☆39

Alternatives and similar repositories for seq2seq-speech

Users who are interested in seq2seq-speech are comparing it to the repositories listed below.

besacier / ASR2022
☆57 · Dec 19, 2022 · Updated 3 years ago
ga642381 / Taiwanese-Speech-Synthesis
Taiwanese Speech Synthesis with Tacotron2
☆26 · Oct 2, 2022 · Updated 3 years ago
VE-FORBRYDERNE / mesh-transformer-jax
Fork of kingoflolz/mesh-transformer-jax with memory usage optimizations and support for GPT-Neo, GPT-NeoX, BLOOM, OPT and fairseq dense L…
☆22 · Nov 14, 2022 · Updated 3 years ago
Hannes1 / react-native-wenet
WeNet speech-to-text for React Native
☆10 · Nov 1, 2022 · Updated 3 years ago
Beomi / exbert-transformers
exBERT on Transformers🤗
☆10 · Jun 14, 2021 · Updated 5 years ago
patrickvonplaten / Wav2Vec2_PyCTCDecode
Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode
☆110 · Aug 31, 2022 · Updated 3 years ago
MTG / Podcastmix
PodcastMix: A dataset for separating music and speech in podcasts.
☆44 · Aug 20, 2024 · Updated last year
GussailRaat / NAACL-19-CIM
Multi-task Learning for Multi-modal Emotion Recognition and Sentiment Analysis
☆13 · Mar 17, 2021 · Updated 5 years ago
AI-Hypercomputer / ray-tpu
☆15 · May 11, 2025 · Updated last year
vtuber-plan / vcvits
Non Parallel Voice Conversion based on VITS
☆24 · Mar 31, 2023 · Updated 3 years ago
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
microsoft / dstoolkit-km-solution-accelerator
Data Science Toolkit - Knowledge Mining Solution Accelerator
☆23 · Mar 18, 2026 · Updated 4 months ago
nervjack2 / Speech2Unit
☆13 · Sep 25, 2024 · Updated last year
ainativehealth / GoodMedicalCoder
☆12 · Sep 21, 2024 · Updated last year
customink / lambda-python-nltk-layer
Lambda layer that enables using the NLTK Python package with AWS Lambda
☆10 · Mar 21, 2024 · Updated 2 years ago
anton-l / wav2vec-toolkit
A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models
☆30 · Apr 21, 2021 · Updated 5 years ago
jimbuck / WorkHive
Lightweight, Browser-based, Grid Computing platform for Node.js
☆12 · Mar 21, 2015 · Updated 11 years ago
kamilakesbi / DiarizersLM
☆15 · Jul 16, 2024 · Updated 2 years ago
mssun / cuhk-beamer
Beamer template with CUHK colors.
☆14 · Jun 8, 2015 · Updated 11 years ago
bagustris / ssl-ser
Repository for reproducing the results of the journal paper "Self-supervised learning for Speech Emotion Recognition"
☆10 · Mar 15, 2023 · Updated 3 years ago
webaudiomodules / sdk
☆23 · Jul 26, 2024 · Updated last year
jasonppy / word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
☆27 · Dec 4, 2023 · Updated 2 years ago
patrickvonplaten / audio-gen-dreambooth
☆23 · Jun 13, 2023 · Updated 3 years ago
sandersn / nltk
NLTK ported to JavaScript
☆13 · Sep 3, 2017 · Updated 8 years ago
George0828Zhang / simulst
PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.
☆25 · Oct 3, 2022 · Updated 3 years ago
cmpute / audio-codec-benchmark
Comprehensive quantitative comparison of lossless and lossy audio codecs
☆41 · Feb 11, 2023 · Updated 3 years ago
amazon-science / proteno
This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…
☆45 · May 25, 2021 · Updated 5 years ago
gauthelo / kallaama-speech-dataset
A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund.
☆20 · Mar 26, 2026 · Updated 3 months ago
lwang114 / GraphUnsupASR
☆10 · Apr 17, 2024 · Updated 2 years ago
slp-rl / SLM-Discrete-Representations
This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…
☆20 · Jan 3, 2023 · Updated 3 years ago
EPCCed / eidf-docs
EIDF Services Documentation
☆20 · Updated this week
voxeet / comms-sdk-cpp
The Dolby.io Communications C++ SDK provides both Client and Server applications the ability to create HD voice and video for fully immer…
☆13 · Aug 30, 2024 · Updated last year
microsoft / e2tts-test-suite
☆32 · Jul 18, 2024 · Updated 2 years ago
Celemony / ARA_Examples
Examples demonstrating proper usage of the ARA Audio Random Access API
☆15 · Jul 11, 2026 · Updated last week
wangcongcong123 / ttt
A package for fine-tuning Transformers with TPUs, written in TensorFlow 2.0+
☆37 · Mar 10, 2021 · Updated 5 years ago
dobby-seo / korean-speech-recognition-quartznet
Korean speech recognition with Quartznet, a quantized model based on Jasper
☆22 · Jul 21, 2021 · Updated 5 years ago
drscotthawley / fad_pytorch
Frechet Audio Distance evaluation in PyTorch
☆36 · Jun 9, 2023 · Updated 3 years ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 · Jun 2, 2023 · Updated 3 years ago
DorBernsohn / CodeLM
A repo for code based language models
☆18 · Feb 10, 2021 · Updated 5 years ago