☆41Oct 29, 2025Updated 4 months ago
Alternatives and similar repositories for ml-sid-dit
Users who are interested in ml-sid-dit are comparing it to the libraries listed below.
- ☆44Mar 12, 2026Updated last week
- 📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3"☆21Dec 2, 2025Updated 3 months ago
- [CVPR 2026] Official repo for "EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation"☆37Mar 13, 2026Updated last week
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆30Dec 22, 2025Updated 2 months ago
- poorman's ar-dit tts☆45Dec 31, 2025Updated 2 months ago
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference☆36Oct 29, 2025Updated 4 months ago
- RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing☆59Dec 26, 2025Updated 2 months ago
- ☆11Apr 16, 2023Updated 2 years ago
- Demo of using WASM to sandbox Plotly execution☆19Mar 30, 2025Updated 11 months ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis☆24Feb 11, 2026Updated last month
- Official repo for paper "IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning"☆41Jan 29, 2026Updated last month
- 👆Pytorch implementation of "Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion"☆33Jul 28, 2025Updated 7 months ago
- Reinforcing Text-Rich Video Reasoning with Visual Rumination☆27Nov 24, 2025Updated 3 months ago
- Code for ICCV 2023 paper ✨ "StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Mo…☆18Jan 25, 2024Updated 2 years ago
- [CVPR 2026] Official pytorch implementation of "ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding"☆21Dec 17, 2025Updated 3 months ago
- WolvCtf-2023-Challenges-Public☆12Apr 13, 2023Updated 2 years ago
- Official implementation of Log-linear Sparse Attention (LLSA).☆63Feb 2, 2026Updated last month
- Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders☆223Feb 13, 2026Updated last month
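- This course introduces the fundamentals of reinforcement learning, with the goal of helping students move quickly and smoothly into research on reinforcement learning and its applications. The main topics include finite Markov decision processes, dynamic programming, model-free prediction and control (SARSA, Q-Learning), value function approximation (DQN), policy gradient methods (REINFORCE), actor/critic…☆18Oct 17, 2022Updated 3 years ago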
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 4 months ago
- minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora☆40Mar 25, 2024Updated last year
- Make self forcing endless. Add cache purging. Add prompt controllability.☆70Sep 9, 2025Updated 6 months ago
- 🍑 relsim: Relational Visual Similarity | pip install relsim 🌍 (CVPR 2026)☆70Feb 21, 2026Updated 3 weeks ago
- ☆36Dec 16, 2025Updated 3 months ago
- Official Implementation of NAF: Zero-Shot Feature Upsampling via Neighborhood Attention Filtering☆71Dec 1, 2025Updated 3 months ago
- ☆40Oct 15, 2023Updated 2 years ago
- An open source implementation of CLIP☆22Nov 6, 2024Updated last year
- The HydraMesh is an open-source evolution of the DeMoD Secure Protocol (DSP), designed as a shareware version to promote community-driven…☆20Mar 12, 2026Updated last week
- Eden Flux LoRA trainer and full-finetuning☆24Mar 21, 2025Updated 11 months ago
- ☆50Feb 12, 2026Updated last month
- ☆16May 28, 2017Updated 8 years ago
- Official pytorch implementation of "AlphaFlow: Understanding and Improving MeanFlow Models"☆109Oct 24, 2025Updated 4 months ago
- Reverse engineering code☆26Jun 22, 2020Updated 5 years ago
- A high-performance Retrieval-Augmented Generation (RAG) solution based on Redis, Qdrant or PostgreSQL. It offers a high-level interface…☆30Jan 6, 2026Updated 2 months ago
- ☆22Jan 22, 2026Updated last month
- Extend the Conditioning of Stable Diffusion to take Audio Embeddings Instead of Text Embeddings using Wav2Vec2-BERT model☆13Sep 25, 2024Updated last year
- Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training☆70Feb 7, 2026Updated last month
- 🐍 An SDK in Python for the Firecracker microVM API☆31Nov 6, 2025Updated 4 months ago
- Official implementation of "CLIP-VQDiffusion : Language Free Training of Text To Image generation using CLIP and vector quantized diffusi…☆19Sep 5, 2024Updated last year