Exquisite video generation
☆15Feb 18, 2024Updated 2 years ago
Alternatives and similar repositories for OpenSora
Users who are interested in OpenSora are comparing it to the libraries listed below.
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆19Jul 16, 2024Updated last year
- ☆56Jul 17, 2023Updated 2 years ago
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- ☆14Jun 16, 2023Updated 2 years ago
- Text-to-text alignment algorithm for speech recognition error analysis.☆29Apr 6, 2026Updated last month
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…☆32Apr 8, 2022Updated 4 years ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024☆16Nov 19, 2024Updated last year
- Pybind11 bindings for Kaldi☆15Feb 1, 2026Updated 3 months ago
- C++ neural network library☆13Jul 2, 2016Updated 9 years ago
- ☆10Jun 4, 2016Updated 9 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation.☆38Apr 30, 2026Updated 3 weeks ago
- VITS2 using Phoneme-Level Japanese BERT☆14Dec 17, 2023Updated 2 years ago
- ASR, End-to-End, end2end, Speech Recognition, end-to-end speech recognition☆12Oct 25, 2020Updated 5 years ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆17Nov 7, 2024Updated last year
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆16Dec 3, 2024Updated last year
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆11Nov 6, 2024Updated last year
- ERISHA is a multilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…☆44Dec 17, 2020Updated 5 years ago
- A tool that collects laughter segments from large amounts of audio data☆12May 23, 2024Updated 2 years ago
- ☆14Nov 22, 2022Updated 3 years ago
- Code for the paper <InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents>.☆18Jun 25, 2021Updated 4 years ago
- [ICASSP 2023] Tempo vs. Pitch: understanding self-supervised tempo estimation☆13Aug 2, 2023Updated 2 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Oct 12, 2022Updated 3 years ago
- ☆14Aug 16, 2023Updated 2 years ago
- Neural model for prediction of stress position in Russian words☆13Jun 22, 2025Updated 11 months ago
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…☆14Dec 30, 2023Updated 2 years ago
- JSGF Deducer based on JSGF grammar and WFST☆11Jan 11, 2018Updated 8 years ago
- 60k hours of phoneme-aligned audio from audio books☆19Jul 27, 2024Updated last year
- Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021☆41Jul 17, 2021Updated 4 years ago
- Fast and differentiable time domain all-pole filter in PyTorch.☆70Feb 5, 2026Updated 3 months ago
- Denoising autoencoders for speaker identification on MCE 2018 challenge☆12Nov 8, 2018Updated 7 years ago
- A Chinese singing voice dataset, professional male singer, 105 songs, 132 minutes☆11Oct 19, 2023Updated 2 years ago
- Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.☆230Mar 20, 2026Updated 2 months ago
- We systematically studied the factors that influence LLM-generated benchmarks. By using our code, you can generate high-quality QA datas…☆20May 20, 2025Updated last year
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning☆15Jun 23, 2024Updated last year
- Github page for the preprint paper "InfoCatVAE: Representation Learning with Categorical Variational Autoencoders"☆14Oct 23, 2020Updated 5 years ago
- Qualcomm AR demo☆14Nov 25, 2016Updated 9 years ago
- Multi-lingual AudioCaps☆14Nov 20, 2023Updated 2 years ago
- 🔥 OpenGrok RESTful interface for Emacs 🔥☆14Jan 7, 2025Updated last year