wl-zhao / THU-Courses
☆17Updated 3 years ago
Alternatives and similar repositories for THU-Courses
Users who are interested in THU-Courses are comparing it to the repositories listed below.
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models☆90Updated last year
- Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation☆138Updated last month
- Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP)☆24Updated last year
- [ICML2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation☆114Updated 8 months ago
- Official implementation of Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement.☆31Updated 10 months ago
- ElasticTok: Adaptive Tokenization for Image and Video☆72Updated 8 months ago
- A framework that allows you to apply sparse autoencoders to any model☆33Updated last week
- Accepted by CVPR 2024☆35Updated last year
- NeurIPS'2022: Pluralistic Image Completion with Gaussian Mixture Models☆14Updated 2 years ago
- [arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"☆130Updated 11 months ago
- Documents used for grad school application☆302Updated 4 years ago
- Chat about anything on any video!☆36Updated last year
- ☆26Updated last month
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites.☆137Updated last week
- [ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models☆34Updated last year
- [CVPR'25] A vision question answering (VQA) benchmark for 6D spatial reasoning.☆10Updated last month
- Official Implementation of Paper Transfer between Modalities with MetaQueries☆149Updated this week
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆31Updated 9 months ago
- A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application☆295Updated 5 months ago
- The official github repo for "Test-Time Training with Masked Autoencoders"☆87Updated last year
- [TIP 2023] Co-Learning Meets Stitch-Up for Noisy Multi-label Visual Recognition.☆13Updated last year
- Sample LaTeX file for HKU PhD thesis.☆26Updated 3 years ago
- [CVPR 2024 Champions][ICLR 2025] Solutions for EgoVis Challenges in CVPR 2024☆127Updated 2 months ago
- Benchmarking and Analyzing Generative Data for Visual Recognition☆26Updated last year
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025)☆38Updated 2 months ago
- Official implementation of ECCV 2024 paper: Take A Step Back: Rethinking the Two Stages in Visual Reasoning☆14Updated last month
- ☆117Updated 2 years ago
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key☆64Updated last month
- A Chrome/Edge extension to help you quickly scan through the flood of daily ArXiv papers.☆14Updated 3 months ago
- ChatGPT - Review & Rebuttal: A browser extension for generating reviews and rebuttals, powered by ChatGPT.☆250Updated 2 years ago