OpenSoraAI/OpenSora

Exquisite video generation

☆15

Alternatives and similar repositories for OpenSora

Users that are interested in OpenSora are comparing it to the libraries listed below.

Sreyan88 / LipGER
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19 · Jul 16, 2024 · Updated 2 years ago
ArwenFeng / tacotron_mandarin
Train Tacotron on a Mandarin dataset
☆18 · May 6, 2019 · Updated 7 years ago
TomJwYu / WenetSpeechSpeakerCluster
☆55 · Jul 17, 2023 · Updated 3 years ago
shinhyeokoh / rwen
☆14 · Jun 16, 2023 · Updated 3 years ago
VITA-Group / Audio-Lottery
[ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…
☆32 · Apr 8, 2022 · Updated 4 years ago
atosystem / SSL_Interface
Interface Design for Self-Supervised Speech Models, accepted to Interspeech 2024
☆16 · Nov 19, 2024 · Updated last year
revsic / torch-retriever-vc
PyTorch implementation of Retriever: Learning Content-Style Representation
☆12 · Jan 27, 2023 · Updated 3 years ago
tonnetonne814 / PL-Bert-VITS2
VITS2 using Phoneme-Level Japanese BERT
☆14 · Dec 17, 2023 · Updated 2 years ago
MontrealCorpusTools / kalpy
Pybind11 bindings for Kaldi
☆15 · Jul 11, 2026 · Updated 2 weeks ago
zyascend / End-to-End-Speech-Recognition-Learning
ASR, End-to-End, end2end, Speech Recognition, end-to-end speech recognition
☆12 · Oct 25, 2020 · Updated 5 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year
kaishengyao / cnn
C++ neural network library
☆13 · Jul 2, 2016 · Updated 10 years ago
mohamedirfansh / Flow
🎓 The Chrome extension to make learning from YouTube faster & easier.
☆11 · Jan 9, 2022 · Updated 4 years ago
anushabala / deep-playlist
☆10 · Jun 4, 2016 · Updated 10 years ago
michaelneri / unsupervised-audio-anomaly-detection
Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …
☆11 · Nov 6, 2024 · Updated last year
NingMiao / InteL-VAEs
Code for the paper "InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents".
☆18 · Jun 25, 2021 · Updated 5 years ago
ajinkyakulkarni14 / ERISHA
ERISHA is a multilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…
☆44 · Dec 17, 2020 · Updated 5 years ago
litagin02 / laughter-collector
A tool that collects the laughter segments from large amounts of audio data
☆14 · May 23, 2024 · Updated 2 years ago
haoheliu / ontology-aware-audio-tagging
☆14 · Nov 22, 2022 · Updated 3 years ago
liuhuang31 / g2pw_once
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14 · Dec 30, 2023 · Updated 2 years ago
giovana-morais / steme
[ICASSP 2023] Tempo vs. Pitch: understanding self-supervised tempo estimation
☆13 · Aug 2, 2023 · Updated 2 years ago
Koziev / StressModel
Neural model for prediction of stress position in Russian words
☆13 · Jun 22, 2025 · Updated last year
frozentoad9 / CMST
Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages
☆13 · Oct 12, 2022 · Updated 3 years ago
TTS-Research / PEL-TTS
☆14 · Aug 16, 2023 · Updated 2 years ago
BiometricVox / DAE_SpeakerID
Denoising autoencoders for speaker identification on MCE 2018 challenge
☆12 · Nov 8, 2018 · Updated 7 years ago
lmaxwell / McHuo
A Chinese singing voice dataset, professional male singer, 105 songs, 132 minutes
☆12 · Oct 19, 2023 · Updated 2 years ago
hymanhsu / JSGFDeducer
JSGF Deducer based on JSGF grammar and WFST
☆11 · Jan 11, 2018 · Updated 8 years ago
ex3ndr / supervoice-librilight-preprocessed
60k hours of phoneme-aligned audio from audio books
☆19 · Jul 27, 2024 · Updated 2 years ago
Wendison / FCL-taco2
Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021
☆41 · Jul 17, 2021 · Updated 5 years ago
jatinhemnani01 / extension-with-svelte
☆10 · Dec 12, 2023 · Updated 2 years ago
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
sarulab-speech / ml-audiocaps
Multi-lingual AudioCaps
☆14 · Nov 20, 2023 · Updated 2 years ago
interactiveaudiolab / ppgs
High-Fidelity Neural Phonetic Posteriorgrams
☆124 · Feb 23, 2025 · Updated last year
DiffAPF / torchlpc
Fast and differentiable time domain all-pole filter in PyTorch.
☆72 · Feb 5, 2026 · Updated 5 months ago
l123456789jy / AR
Qualcomm AR demo
☆14 · Nov 25, 2016 · Updated 9 years ago
zhangmeishan / NNTranJSTagger
☆12 · Oct 9, 2018 · Updated 7 years ago
KranthiKumarR / Localize-to-Binauralize
Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization (ICCV 2021)
☆10 · Oct 11, 2021 · Updated 4 years ago
leavelet / singing-database-maker
AI based singing voice synthesis database generator
☆13 · Aug 12, 2022 · Updated 3 years ago