☆773Jun 1, 2026Updated last week
Alternatives and similar repositories for Hy-MT
Users that are interested in Hy-MT are comparing it to the libraries listed below.
- 🙊Cogified speech-to-text model nvidia/canary-qwen-2.5b (best ASR model according to hf-audio/open_asr_leaderboard as of 18/Jul/2025)🎙️☆21Jul 28, 2025Updated 10 months ago
- [arXiv 2512.17796] Animate Any Character in Any World☆96Mar 10, 2026Updated 3 months ago
- Official implementation of UniverSR (ICASSP 2026)☆50Apr 9, 2026Updated 2 months ago
- ☆710Dec 30, 2025Updated 5 months ago
- A Unified Visual Generator with Interleaved OmniModal Context☆227Mar 5, 2026Updated 3 months ago
- Local AI runtime for training & running small LLMs directly on Apple Neural Engine (ANE). No CoreML. No Metal. Offline, on-device fine-tu…☆97Mar 6, 2026Updated 3 months ago
- ☆35Oct 23, 2025Updated 7 months ago
- Causal streaming adaptation of OpenAI Whisper for real-time transcription on small audio chunks.☆74Mar 31, 2026Updated 2 months ago
- GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning☆1,017Apr 10, 2026Updated 2 months ago
- ☆41May 12, 2026Updated last month
- ☆72Jan 12, 2026Updated 5 months ago
- HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation☆672Oct 14, 2025Updated 8 months ago
- ☆87Sep 25, 2025Updated 8 months ago
- 🌋LavaSR: Fast Speech restoration and enhancement☆549Jun 5, 2026Updated last week
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models".☆58Dec 28, 2025Updated 5 months ago
- ☆67May 7, 2026Updated last month
- Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.☆966Feb 27, 2026Updated 3 months ago
- Website for CSE 234, Winter 2025☆15Mar 24, 2025Updated last year
- Official implementation of AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories☆92Feb 17, 2026Updated 3 months ago
- [CVPR 2026] Official repo of "MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing"☆110Apr 13, 2026Updated 2 months ago
- Java SDK for Z.ai Open Platform☆59Jun 4, 2026Updated last week
- https://little-misfit.github.io/GRAG-Image-Editing/☆119Nov 27, 2025Updated 6 months ago
- ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation☆118Dec 11, 2025Updated 6 months ago
- Internal utility libraries for Pkl☆16Jun 4, 2026Updated last week
- A framework aiming to bridge fast robot prototyping, predefined motion primitives, heterogeneous teleoperation, data collection, and flex…☆27Apr 4, 2026Updated 2 months ago
- Baidu Qianfan Deep Research☆34Jun 8, 2026Updated last week
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- ☆89May 13, 2026Updated last month
- GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters☆809Mar 6, 2026Updated 3 months ago
- ☆71May 2, 2026Updated last month
- Simple, Efficient, and Effective Negative Guidance in Few-Step Image Generation Models By Value Sign Flip☆38Jan 27, 2026Updated 4 months ago
- SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations (CVPR 2026 Findings)☆990May 6, 2026Updated last month
- ☆18Oct 19, 2024Updated last year
- Multilingual and Multiculture Benchmark and LLM☆40May 18, 2026Updated 3 weeks ago
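- A GUI-based blog publishing system that supports one-click publishing to multiple platforms, including 51CTO, Cnblogs, Jianshu, Juejin, and CSDN. The tool aims to simplify cross-platform content sharing for content creators.☆51Nov 18, 2024Updated last year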
- A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model that excels at editing emotion, speaking style, and paralinguistics…☆926Apr 9, 2026Updated 2 months ago
- [Arxiv 2025] Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder…☆152Dec 18, 2025Updated 5 months ago
- tampermonkey-script☆20Feb 27, 2026Updated 3 months ago
- Brand new TTS solution☆11Dec 7, 2024Updated last year