Tencent-Hunyuan / HY-MT

☆474

Alternatives and similar repositories for HY-MT

Users who are interested in HY-MT are comparing it to the repositories listed below.

snowflakewang / AniX
Animate Any Character in Any World
☆88 · Jan 9, 2026 · Updated last month
SOTAMak1r / VINO-code
A Unified Visual Generator with Interleaved OmniModal Context
☆180 · Updated this week
XiaokunSun / MorphAny3D
Official repo of "MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing"
☆77 · Jan 5, 2026 · Updated last month
danielesteban / imgpocket
Encode/Decode magic data pockets inside images
☆15 · Sep 18, 2023 · Updated 2 years ago
linkedlist771 / DeMark-World
A Universal Framework for AI Video Watermark Removal
☆49 · Dec 5, 2025 · Updated 2 months ago
zai-org / z-ai-sdk-java
Java SDK for Z.ai Open Platform
☆43 · Feb 2, 2026 · Updated last week
XiaomiMiMo / MiMo-Audio-Eval
☆77 · Sep 25, 2025 · Updated 4 months ago
camenduru / AniPortrait-vid2vid-replicate
☆19 · Mar 27, 2024 · Updated last year
yichuanH / GaMO_official
☆66 · Jan 12, 2026 · Updated last month
argmaxinc / OpenBench
Open-source reproducible benchmarks from Argmax
☆77 · Jan 19, 2026 · Updated 3 weeks ago
rock-biter / ice-trails
☆28 · Jun 4, 2025 · Updated 8 months ago
JinjieNi / OpenMoE2
The official repo for "OpenMoE 2: Sparse Diffusion Language Models".
☆52 · Dec 28, 2025 · Updated last month
ysharma3501 / LuxTTS
A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime.
☆694 · Jan 28, 2026 · Updated 2 weeks ago
arthur-qiu / HiStream
☆131 · Dec 24, 2025 · Updated last month
emjay73 / InfCam
☆86 · Feb 4, 2026 · Updated last week
zai-org / GLM-TTS
GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning
☆923 · Dec 17, 2025 · Updated last month
ykk648 / face_power
Face_lib separate from AI_Power
☆28 · Nov 10, 2025 · Updated 3 months ago
KlingTeam / SVG-T2I
Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder".
☆132 · Dec 18, 2025 · Updated last month
FunAudioLLM / Fun-Audio-Chat
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
☆835 · Jan 29, 2026 · Updated 2 weeks ago
liu-qingyuan / faster_whisper_gradio
Real time faster whisper gradio
☆25 · Aug 17, 2025 · Updated 5 months ago
sidhantls / lexpod-speaker-prediction
Speaker prediction for captions on the Lex Fridman podcast
☆27 · Feb 14, 2024 · Updated 2 years ago
Narsil / hf-chat
☆27 · Dec 13, 2024 · Updated last year
Tencent-Hunyuan / Hunyuan-0.5B
☆53 · Aug 5, 2025 · Updated 6 months ago
instant-high / wav2lip-onnx
simple and fast wav2lip using onnx models for face-detection and inference. Easy installation
☆28 · Oct 14, 2024 · Updated last year
x1aoqv / DSRE---Digital-Sound-Resolution-Enhancer
High-speed batch audio enhancer that restores high-frequency details like Sony DSEE HX, converting any audio file to Hi-Res.
☆44 · Sep 7, 2025 · Updated 5 months ago
GiantAILab / YingVideo-MV
☆76 · Dec 8, 2025 · Updated 2 months ago
tomer9080 / CarelessWhisper-Streaming
Causal streaming adaptation of OpenAI Whisper for real-time transcription on small audio chunks.
☆62 · Sep 18, 2025 · Updated 4 months ago
Tencent-Hunyuan / HunyuanImage-2.1
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
☆672 · Oct 14, 2025 · Updated 4 months ago
HumanAIGC / chat-anyone
project page for ChatAnyone
☆116 · Mar 28, 2025 · Updated 10 months ago
QwenLM / Qwen3-ASR
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music…
☆1,489 · Jan 30, 2026 · Updated 2 weeks ago
Tikai7 / DiTTO-TTS
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
☆35 · Feb 11, 2025 · Updated last year
tsvilans / PrePoMax
Forked from https://gitlab.com/MatejB/PrePoMax
☆12 · Jan 8, 2024 · Updated 2 years ago
MiniMax-AI / MiniMax-Provider-Verifier
MiniMax-Provider-Verifier offers a rigorous, vendor-agnostic way to verify whether third-party deployments of the Minimax M2 model are co…
☆23 · Jan 15, 2026 · Updated last month
kyutai-labs / nanoGPTaudio
Code for the blog "Neural audio codecs: how to get audio into LLMs"
☆151 · Oct 20, 2025 · Updated 3 months ago
AutoArk / GPA
[AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion with one tiny 300M model!
☆86 · Jan 29, 2026 · Updated 2 weeks ago
NON906 / chara-searcher
This is a repository for "character images search" from image and tags.
☆37 · Sep 17, 2025 · Updated 4 months ago
JavisVerse / JavisGPT
[NeurIPS'25 Spotlight] Official implementation of "JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation"
☆70 · Jan 10, 2026 · Updated last month
AIDC-AI / Ovis
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
☆1,430 · Sep 22, 2025 · Updated 4 months ago
shuaijiang / Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training wit…
☆312 · Dec 22, 2025 · Updated last month
