vinthony / academic

Yet Another Academic Homepage Template

☆21

Alternatives and similar repositories for academic

Users who are interested in academic are comparing it to the repositories listed below.

Nicous20 / FunQA
FunQA benchmarks funny, creative, and magic videos for challenging tasks including timestamp localization, video description, reasoning, …
☆104 Updated 10 months ago
seervideodiffusion / SeerVideoLDM
[ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models
☆33 Updated last year
BolinLai / LEGO
[ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation…
☆38 Updated 7 months ago
IDEA-Research / DiffHOI
Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model"
☆64 Updated 2 years ago
Video-as-Agent / VideoAgent
Official implementation of "Self-Improving Video Generation"
☆74 Updated 5 months ago
yukw777 / EILEV
EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties
☆131 Updated 11 months ago
Max-Fu / tvl
[ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment
☆84 Updated 4 months ago
yukw777 / VideoBLIP
Supercharged BLIP-2 that can handle videos
☆122 Updated last year
human-centeredAI / awesomeHAI
a reading list for human-centered AI
☆44 Updated 3 years ago
TonyLianLong / LLM-groundedVideoDiffusion
[ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper
☆158 Updated last year
liveseongho / Awesome-Video-Language-Understanding
A Survey on video and language understanding.
☆50 Updated 2 years ago
xvjiarui / IMProv
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
☆57 Updated last year
showlab / VisorGPT
[NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT
☆135 Updated last year
facebookresearch / EgoVLPv2
Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023]
☆99 Updated last year
keunhong / keunhong.github.io
☆75 Updated 2 months ago
Wuziyi616 / SlotDiffusion
Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models
☆92 Updated last year
LilyDaytoy / OpenPVSG
Benchmarking Panoptic Video Scene Graph Generation (PVSG), CVPR'23
☆97 Updated last year
xk-huang / segment-caption-anything
[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloadin…
☆228 Updated last year
nanlliu / Unsupervised-Compositional-Concepts-Discovery
[ICCV 2023] Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models
☆85 Updated 2 years ago
snap-research / MyVLM
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)
☆179 Updated last year
LargeWorldModel / ElasticTok
ElasticTok: Adaptive Tokenization for Image and Video
☆80 Updated 11 months ago
shashankvkt / DoRA_ICLR24
This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long …
☆93 Updated last year
renwang435 / video-ttt-release
Test-Time Training on Video Streams
☆64 Updated 2 years ago
facebookresearch / EgoT2
Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)
☆33 Updated 2 years ago
TIGER-AI-Lab / VideoScore
official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024]
☆101 Updated 8 months ago
Yui010206 / CREMA
[ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
☆53 Updated 3 months ago
ndrwmlnk / Awesome-Video-Diffusion-Models
☆51 Updated 8 months ago
hananshafi / llmblueprint
[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"
☆81 Updated last year
amazon-science / AdaSlot
Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number]
☆57 Updated 8 months ago
tsunghan-wu / SLD
🔥 [CVPR2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models (SLD)"
☆182 Updated last year