wl-zhao / THU-Courses
☆17Updated 3 years ago
Alternatives and similar repositories for THU-Courses
Users who are interested in THU-Courses are comparing it to the libraries listed below
- [ICML2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation☆117Updated 10 months ago
- ElasticTok: Adaptive Tokenization for Image and Video☆75Updated 9 months ago
- [CVPR'25] A vision question answering (VQA) benchmark for 6D spatial reasoning.☆10Updated 2 months ago
- Official Implementation of Paper Transfer between Modalities with MetaQueries☆222Updated last month
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models☆91Updated last year
- SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enh…☆32Updated last year
- Being-VL-0.5: Unified Multimodal Understanding via Byte-Pair Visual Encoding☆23Updated last month
- Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation☆139Updated 3 months ago
- ☆11Updated last month
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆242Updated last year
- The author's implementation of FUDOKI, a multimodal large language model purely based on discrete flow matching.☆48Updated 3 weeks ago
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training☆182Updated 7 months ago
- Sample LaTeX file for HKU PhD thesis.☆26Updated 3 years ago
- Python logging package for easy reproducible experimenting in research☆40Updated last month
- A framework that allows you to apply Sparse AutoEncoder on any models☆36Updated last month
- Documents used for grad school application☆302Updated 4 years ago
- [arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"☆135Updated last week
- A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application☆306Updated 7 months ago
- A reading list for human-centered AI☆44Updated 3 years ago
- ☆31Updated 3 months ago
- Accepted by CVPR 2024☆37Updated last year
- Code accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment"☆33Updated 6 months ago
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites.☆161Updated 2 weeks ago
- Official Implementation of VideoDPO☆137Updated 3 months ago
- ☆258Updated last year
- Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).☆39Updated last year
- [ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models☆34Updated last year
- Benchmarking and Analyzing Generative Data for Visual Recognition☆26Updated 2 years ago
- ☆113Updated last year
- NeurIPS'2022: Pluralistic Image Completion with Gaussian Mixture Models☆14Updated 2 years ago