alicank/Translation-Augmented-LibriSpeech-Corpus

alicank / Translation-Augmented-LibriSpeech-Corpus

A large-scale (>200h), publicly available corpus of read audiobooks. It augments the LibriSpeech ASR Corpus (1000h) and contains English utterances (from audiobooks) automatically aligned with French text. The dataset offers ~236h of speech aligned to translated text.

☆44

Alternatives and similar repositories for Translation-Augmented-LibriSpeech-Corpus

Users who are interested in Translation-Augmented-LibriSpeech-Corpus are comparing it to the repositories listed below.

joshua-decoder / fisher-callhome-corpus
The Fisher and CALLHOME Spanish–English Speech Translation Corpus
☆41 · Feb 10, 2022 · Updated 4 years ago
sheffieldnlp / deepQuest
Framework for neural-based Quality Estimation
☆41 · Sep 23, 2020 · Updated 5 years ago
ImperialNLP / MMT-Delib
☆10 · Dec 21, 2022 · Updated 3 years ago
kjw11 / CSEnet-ASR
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12 · Mar 14, 2025 · Updated last year
mattiadg / FBK-Fairseq-ST
An adaptation of Fairseq to (End-to-end) speech translation.
☆22 · Jun 1, 2022 · Updated 4 years ago
ustctf-zz / delibnet
☆14 · Nov 16, 2022 · Updated 3 years ago
neulab / xnmt
eXtensible Neural Machine Translation
☆189 · Sep 22, 2025 · Updated 10 months ago
Aisaka0v0 / TS-Whisper
☆33 · Jun 12, 2025 · Updated last year
isl-mt / SLT.KIT
Spoken Language Translation System
☆20 · Jul 26, 2021 · Updated 5 years ago
bzhangGo / zero
Zero -- A neural machine translation system
☆152 · May 8, 2023 · Updated 3 years ago
datamllab / autokeras-algorithm
Some other AutoML algorithms as baselines.
☆12 · Apr 2, 2019 · Updated 7 years ago
Blank-Wang / DCASE2018-Task4
Weakly Supervised CRNN System for Sound Event Detection With Large-scale Unlabeled In-domain Data
☆11 · Oct 31, 2018 · Updated 7 years ago
EMRAI / emrai-synthetic-diarization-corpus
☆22 · Sep 24, 2018 · Updated 7 years ago
kahne / SpeechTransProgress
Tracking the progress in end-to-end speech translation
☆260 · Oct 25, 2023 · Updated 2 years ago
CyberZHG / keras-layer-normalization
Layer normalization implemented in Keras
☆60 · Jan 22, 2022 · Updated 4 years ago
Mashiro009 / slidespeech_dl
☆24 · Sep 20, 2024 · Updated last year
HuangZiliAndy / SSL_for_multitalker
ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS
☆33 · Mar 16, 2023 · Updated 3 years ago
srvk / how2-dataset
This repository contains code and metadata of How2 dataset
☆192 · Dec 30, 2024 · Updated last year
bytedance / neurst
Neural end-to-end Speech Translation Toolkit
☆306 · Jun 28, 2022 · Updated 4 years ago
icoxfog417 / tying-wv-and-wc
Implementation for "Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling"
☆38 · Aug 8, 2017 · Updated 8 years ago
alistairewj / icu-model-transfer
Evaluating methods to improve model transfer for intensive care unit models
☆16 · Jul 6, 2023 · Updated 3 years ago
alex-berard / seq2seq
Attention-based sequence to sequence learning
☆388 · May 9, 2019 · Updated 7 years ago
lovecambi / qebrain
machine translation and quality estimation
☆35 · Jan 13, 2019 · Updated 7 years ago
karlstratos / ammi
☆11 · Jul 15, 2020 · Updated 6 years ago
LinguisticAnomalies / harmonized-toolkit
Toolkit for Reproducible Execution of Speech, Text and Language Experiments
☆10 · Mar 24, 2026 · Updated 4 months ago
30stomercury / hmm-backprop
Fast and differentiable hidden Markov model in C++
☆19 · Jan 20, 2023 · Updated 3 years ago
wbbeyourself / arxiv_paper_downloader
Arxiv daily paper downloader and manage papers with markdown preview.
☆41 · Jul 16, 2024 · Updated 2 years ago
forgi86 / lru-reduction
Python code of the paper Model order reduction of deep structured state-space models: A system-theoretic approach
☆14 · Nov 22, 2024 · Updated last year
ImperialNLP / pysimt
Simultaneous NMT/MMT framework in PyTorch
☆38 · Mar 22, 2025 · Updated last year
LingweiMeng / Whisper-Sidecar
The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".
☆34 · Aug 2, 2025 · Updated 11 months ago
hsd1503 / PIC_mortality
Predicting In-hospital Mortality of Patients in the Pediatric ICU
☆16 · Feb 7, 2026 · Updated 5 months ago
jvbalen / sample_100
A dataset of Hip Hop samples for Music Information Retrieval research
☆11 · Jun 1, 2016 · Updated 10 years ago
cuhealthybrains / MT-LLM
The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"
☆51 · Apr 7, 2025 · Updated last year
Victorwz / tod_as_nlg
Official implementation of SIGIR 2022 Paper "Task-Oriented Dialogue System as Natural Language Generation".
☆14 · Apr 6, 2022 · Updated 4 years ago
facebookresearch / covost
CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed)
☆401 · Sep 14, 2021 · Updated 4 years ago
WuZhuoran / Plant_Seedlings_Classification
Kaggle Competition Project as well as ANLY 590 Final Project. Task: Determine the species of a seedling from an image
☆16 · Dec 17, 2018 · Updated 7 years ago
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
☆13 · Mar 23, 2026 · Updated 4 months ago
uiwjs / next-remove-imports
The default behavior is to remove all .less/.css/.scss/.sass/.styl imports from all packages in node_modules.
☆17 · Apr 21, 2026 · Updated 3 months ago
wns823 / NMT_SSP
NMT with ssp
☆11 · Oct 28, 2021 · Updated 4 years ago