vinthony / academic
Yet Another Academic Homepage Template
☆19 Updated this week
Alternatives and similar repositories for academic:
Users who are interested in academic are comparing it to the repositories listed below.
- ☆70 Updated last month
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆58 Updated 6 months ago
- ☆17 Updated 5 months ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation … ☆37 Updated last month
- 🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU) ☆31 Updated 2 months ago
- Official Release of NeurIPS 2023 Spotlight paper "Object-Centric Slot Diffusion" ☆64 Updated last year
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆85 Updated last year
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models ☆28 Updated 4 months ago
- [CVPR 2025] Few-shot Recognition via Stage-Wise Retrieval-Augmented Finetuning ☆14 Updated last week
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ☆41 Updated 2 months ago
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆44 Updated last year
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆44 Updated 3 weeks ago
- Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight) ☆32 Updated last year
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆56 Updated last year
- ☆29 Updated 9 months ago
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation ☆31 Updated 2 months ago
- ☆123 Updated 2 months ago
- Official code repo of CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation ☆17 Updated 2 weeks ago
- Training code for CLIP-FlanT5 ☆26 Updated 8 months ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆64 Updated 4 months ago
- Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration". ☆44 Updated 3 months ago
- Language Repository for Long Video Understanding ☆31 Updated 9 months ago
- ☆42 Updated last year
- [CVPR 2023] Zero-shot Generative Model Adaptation via Image-specific Prompt Learning ☆84 Updated last year
- Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model" ☆58 Updated last year
- LAVIS - A One-stop Library for Language-Vision Intelligence ☆47 Updated 7 months ago
- Interpretable Diffusion Via Information Decomposition ☆28 Updated 8 months ago
- Personalized Representation from Personalized Generation (ICLR 2025) ☆54 Updated 3 weeks ago
- [CVPR 2024 Highlight] ImageNet-D ☆41 Updated 5 months ago
- ☆59 Updated last year