ut-vision / SiMHand
[ICLR 2025] SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training
☆16 Updated last week
Alternatives and similar repositories for SiMHand:
Users who are interested in SiMHand are comparing it to the libraries listed below.
- Official code for MotionBench (CVPR 2025) ☆31 Updated 3 weeks ago
- [CVPR 2025] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key ☆40 Updated 2 weeks ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆61 Updated 4 months ago
- Official implementation of "Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization" ☆76 Updated 11 months ago
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆83 Updated last year
- SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enh… ☆30 Updated 7 months ago
- [CIKM-2024] Official code for work "ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance" ☆18 Updated 7 months ago
- [CVPR 2025] Open implementation of "RandAR" ☆65 Updated this week
- Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24). ☆38 Updated 10 months ago
- A PyTorch Implementation of Finite Scalar Quantization ☆114 Updated last year
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training ☆177 Updated 2 months ago
- ☆17 Updated 4 months ago
- [ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models ☆29 Updated 10 months ago
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆43 Updated 9 months ago
- Awesome list of Mixture-of-Experts (MoE) ☆18 Updated 9 months ago
- [ICLR2025] The code of Z-Sampling, proposed in our paper "Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflectio… ☆60 Updated last month
- ✌ CLoG: Benchmarking Continual Learning of Image Generation Models ☆17 Updated 9 months ago
- The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation ☆92 Updated 5 months ago
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆121 Updated 3 weeks ago
- ICLR2023 statistics ☆60 Updated last year
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models ☆28 Updated 4 months ago
- Unofficial implementation of "SODA: Bottleneck Diffusion Models for Representation Learning" ☆84 Updated last year
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆86 Updated 5 months ago
- A paper list for spatial reasoning ☆51 Updated last month
- A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application ☆259 Updated last month
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆61 Updated 3 weeks ago
- Collection of awesome Continual Test-Time Adaptation methods ☆15 Updated 9 months ago
- Denoising Diffusion Step-aware Models (ICLR2024) ☆58 Updated last year
- ☆54 Updated last year
- Official implementation for 'Class-Balancing Diffusion Models' ☆52 Updated 10 months ago