Visual Generation Tuning
☆100Apr 16, 2026Updated 2 weeks ago
Alternatives and similar repositories for VGT
Users who are interested in VGT are comparing it to the libraries listed below.
- Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"☆54Jul 5, 2025Updated 10 months ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos"☆20Dec 14, 2025Updated 4 months ago
- [ICLR 2026 🔥 ] Official implementation of "UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing"☆148Jan 26, 2026Updated 3 months ago
- Towards Scalable Pre-training of Visual Tokenizers for Generation☆475Apr 15, 2026Updated 2 weeks ago
- EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]☆142Apr 11, 2026Updated 3 weeks ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT☆134Jan 30, 2026Updated 3 months ago
- Official implementation of the paper "Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability".☆55Dec 25, 2025Updated 4 months ago
- ☆153Mar 18, 2026Updated last month
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆33Dec 22, 2025Updated 4 months ago
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆30Mar 25, 2026Updated last month
- A solutions manual for Introduction to Set Theory by Hrbacek and Jech☆11Aug 30, 2024Updated last year
- ☆30Oct 26, 2025Updated 6 months ago
- ThinkGen: Generalized Thinking for Visual Generation☆52Dec 30, 2025Updated 4 months ago
- [ACL 2026] From Word to World: Can Large Language Models be Implicit Text-based World Models?☆62Apr 13, 2026Updated 3 weeks ago
- A code base for the official XS-VID dataset baseline method YOLOFT☆20Dec 24, 2024Updated last year
- [AAAI 2026] Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices☆116Nov 30, 2025Updated 5 months ago
- A hardware-aware efficient implementation of "Mixture-of-Depths Attention".☆255Apr 21, 2026Updated 2 weeks ago
- ☆43Mar 27, 2026Updated last month
- Create PDF animations from graphics files and inline graphics using LaTeX☆12Jun 8, 2018Updated 7 years ago
- [ICCV 2025] FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models☆19Aug 26, 2025Updated 8 months ago
- VideoGPA is a self-supervised framework that enhances 3D consistency in Video Diffusion Models.☆51Apr 17, 2026Updated 2 weeks ago
- PyTorch implementation of NEPA☆332Feb 9, 2026Updated 2 months ago
- ☆94May 15, 2025Updated 11 months ago
- This is the project for 'USG'.☆38Apr 7, 2025Updated last year
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"☆145Nov 4, 2025Updated 6 months ago
- Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models"☆69Apr 4, 2026Updated last month
- [NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning☆235Apr 17, 2026Updated 2 weeks ago
- ☆11Jan 18, 2025Updated last year
- [JAG 2026] DreamCD: A change-label-free framework for change detection via a weakly conditional semantic diffusion model in optical VHR i…☆25Jan 30, 2026Updated 3 months ago
- ☆52Jun 13, 2025Updated 10 months ago
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202…☆40May 26, 2025Updated 11 months ago
- [CVPR 2026 Highlight] A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens☆98Apr 21, 2026Updated 2 weeks ago
- UniVid: The Open-Source Unified Video Model☆32Oct 13, 2025Updated 6 months ago
- Sign Agnostic Learning with Derivatives☆14Jan 24, 2022Updated 4 years ago
- Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder".☆148Dec 18, 2025Updated 4 months ago
- The benchmark for "Video Object Segmentation in Panoptic Wild Scenes".☆12Oct 17, 2023Updated 2 years ago
- The essential tasks on the command line interface for web developers.☆38Nov 3, 2023Updated 2 years ago
- [ACL 2026] Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark☆24Apr 13, 2026Updated 3 weeks ago
- Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization☆64Sep 19, 2025Updated 7 months ago