☆172 · Apr 16, 2026 · Updated last week
Alternatives and similar repositories for Audio-Omni
Users who are interested in Audio-Omni are comparing it to the libraries listed below.
- [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis ☆39 · Dec 24, 2025 · Updated 4 months ago
- Training code for MaskGCT-T2S model. ☆24 · Dec 14, 2024 · Updated last year
- ☆26 · Jul 6, 2022 · Updated 3 years ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆47 · Sep 19, 2025 · Updated 7 months ago
- An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators. ☆240 · Feb 26, 2026 · Updated 2 months ago
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆44 · Oct 28, 2024 · Updated last year
- [CVPR 2025] PyTorch implementation of the paper "Hearing Anywhere in Any Environment" ☆31 · Sep 18, 2025 · Updated 7 months ago
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS ☆24 · Sep 9, 2024 · Updated last year
- Material UI component wrappers ☆20 · Apr 26, 2024 · Updated 2 years ago
- A Python library designed to accelerate Perturbed Substructure Optimization using Genetic Algorithms. ☆51 · Mar 7, 2026 · Updated last month
- An easy-to-integrate performance monitoring SDK ☆22 · Feb 17, 2023 · Updated 3 years ago
- Official code for paper: "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding" ☆36 · Jan 28, 2026 · Updated 3 months ago
- ☆34 · Sep 15, 2025 · Updated 7 months ago
- ☆28 · Apr 10, 2026 · Updated 2 weeks ago
- Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations ☆22 · Dec 24, 2025 · Updated 4 months ago
- [ICLR2026] WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction ☆69 · Sep 3, 2025 · Updated 7 months ago
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD… ☆77 · Jan 25, 2026 · Updated 3 months ago
- ☆55 · Mar 2, 2023 · Updated 3 years ago
- Adaptive Vocoder for Custom Voice ☆61 · Sep 22, 2022 · Updated 3 years ago
- Audio-FLAN ☆160 · Sep 23, 2025 · Updated 7 months ago
- [ICLR2026] Video-GPT via Next Clip Diffusion. ☆45 · Jun 2, 2025 · Updated 10 months ago
- APOLLUMIA is an ERC-20 token implemented on the Ethereum blockchain. It incorporates transaction tax mechanisms, anti-bot protections, an… ☆32 · Apr 28, 2025 · Updated last year
- Source code for "Modulation Extraction for LFO-driven Audio Effects". ☆32 · Mar 25, 2026 · Updated last month
- Zero-Shot Blind Audio Bandwidth Extension ☆27 · May 25, 2023 · Updated 2 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆62 · Nov 1, 2024 · Updated last year
- A vision-language model with an improved cross-attention mechanism for scalable streaming inference ☆29 · Mar 9, 2026 · Updated last month
- ☆11 · Oct 8, 2020 · Updated 5 years ago
- Desktop tool for quickly opening files, applications, projects, folders, etc. ☆33 · Aug 20, 2023 · Updated 2 years ago
- A TTS Trained on Universal Audio. ☆41 · Jun 6, 2025 · Updated 10 months ago
- ☆30 · Feb 4, 2021 · Updated 5 years ago
- Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis" ☆110 · Dec 20, 2025 · Updated 4 months ago
- [IROS 2024] SCANet: Correcting LEGO Assembly Errors with Self-Correct Assembly Network (FINALIST BEST APPLICATION PAPER) ☆23 · Oct 26, 2024 · Updated last year
- [ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM? ☆46 · Nov 21, 2025 · Updated 5 months ago
- [CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners ☆155 · Jul 6, 2024 · Updated last year
- ☆29 · Mar 10, 2026 · Updated last month
- "Stochasticity in Neural ODEs: An Empirical Study". Experiments from the paper ☆13 · Apr 27, 2020 · Updated 6 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆78 · Nov 1, 2024 · Updated last year
- DSing ASR task: Resources and Baseline for an unaccompanied singing ASR. ☆19 · Nov 23, 2021 · Updated 4 years ago
- Enterprise-grade multi-module Vue SPA architecture for PC and mobile environments ☆12 · Jan 16, 2023 · Updated 3 years ago