tunglm2203 / erlvlm
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models (ICML 2025)
☆19 Updated 2 weeks ago
Alternatives and similar repositories for erlvlm
Users who are interested in erlvlm are comparing it to the libraries listed below.
- ☆12 Updated last year
- You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos. ☆414 Updated 6 months ago
- [ICLR2025] Halton Scheduler for Masked Generative Image Transformer ☆244 Updated 2 weeks ago
- This is the official code release for our work, Denoising Vision Transformers. ☆370 Updated 8 months ago
- This is a repo to track the latest autoregressive visual generation papers. ☆372 Updated 3 weeks ago
- Winning SubNetwork (WSN), Fourier Subneural Operator (FSO), Video-Incremental Learning (VIL), Sequential Neural Implicit Representation (… ☆25 Updated 8 months ago
- [CVPR'23] Video Probabilistic Diffusion Models in Projected Latent Space ☆317 Updated last year
- [CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ☆1,017 Updated last month
- [NeurIPS'23] Emergent Correspondence from Image Diffusion ☆707 Updated last year
- Official Implementation of paper "A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence" ☆325 Updated last year
- ☆503 Updated 2 months ago
- Comparison between Frechet Video Distance implementation from StyleGAN-V and the original paper ☆107 Updated 6 months ago
- Pytorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf) ☆447 Updated last year
- [CVPR 2024 Highlight] Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer ☆398 Updated 7 months ago
- [ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation ☆495 Updated 8 months ago
- PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions. ☆433 Updated last year
- Code for Scaling Language-Free Visual Representation Learning (WebSSL) ☆246 Updated 2 months ago
- Official Jax Implementation of MaskGIT ☆522 Updated 2 years ago
- This repository categorizes the papers about diffusion models applied in computer vision according to their target task. The classifcatio… ☆400 Updated last year
- High-performance Image Tokenizers for VAR and AR ☆276 Updated 2 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆178 Updated this week
- This repo contains the code for 1D tokenizer and generator ☆949 Updated 3 months ago
- [ICCV 2023] VPD is a framework that leverages the high-level and low-level knowledge of a pre-trained text-to-image diffusion model to do… ☆527 Updated last year
- [Doc] Productive Deep Learner ☆14 Updated 5 months ago
- ☆29 Updated last week
- Implementation of the paper "MaskBit: Embedding-free Image Generation from Bit Tokens" ☆79 Updated 3 months ago
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think ☆1,194 Updated 4 months ago
- ☆577 Updated 7 months ago
- [ECCV 2024] FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing ☆35 Updated last week
- 🚀 Cross attention map tools for huggingface/diffusers ☆319 Updated 6 months ago