vinthony / academic
Yet Another Academic Homepage Template
☆20Updated last week
Alternatives and similar repositories for academic
Users who are interested in academic are comparing it to the repositories listed below.
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation …☆37Updated 2 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models …☆59Updated 2 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆58Updated 7 months ago
- ☆42Updated last year
- Official code for MotionBench (CVPR 2025)☆37Updated 2 months ago
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment☆77Updated 3 months ago
- Language Repository for Long Video Understanding☆31Updated 11 months ago
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models☆86Updated last year
- [ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models☆31Updated 11 months ago
- ☆22Updated 6 months ago
- This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long …☆88Updated last year
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs☆26Updated 4 months ago
- [ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration☆48Updated 2 weeks ago
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"☆35Updated last year
- [ECCV2022] A PyTorch implementation of the paper "Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embo…☆13Updated 2 years ago
- ☆61Updated last year
- Personalized Representation from Personalized Generation (ICLR 2025)☆64Updated 2 months ago
- Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model"☆61Updated last year
- 🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU)☆40Updated 3 months ago
- Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)☆32Updated last year
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites.☆91Updated last week
- Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"☆25Updated last year
- Unifying Specialized Visual Encoders for Video Language Models☆18Updated last week
- [ICCV2023] EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding☆76Updated last year
- Code for the paper "GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos" published at CVPR 2024☆51Updated last year
- ☆29Updated 10 months ago
- Egocentric Video Understanding Dataset (EVUD)☆29Updated 10 months ago
- ECCV 2024 paper template☆50Updated last year
- FunQA benchmarks funny, creative, and magic videos for challenging tasks including timestamp localization, video description, reasoning, …☆101Updated 5 months ago
- Official PyTorch Implementation for Diffusion Hyperfeatures, NeurIPS 2023☆101Updated 6 months ago