SenseTime-FVG / UniMLVG
☆9 Updated 6 months ago
Alternatives and similar repositories for UniMLVG
Users that are interested in UniMLVG are comparing it to the libraries listed below
- Official Pytorch Implementation of "Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generati… ☆9 Updated 7 months ago
- DiFSD: Ego-Centric Fully Sparse Paradigm for End-to-End Self-Driving ☆11 Updated 4 months ago
- [⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition ☆13 Updated last month
- (CVPR 2025 Highlight) Official repository of paper "AODRaw: Towards RAW Object Detection in Diverse Conditions" (https://arxiv.org/pdf/24… ☆13 Updated 3 months ago
- [ICLR 2025] Causal Graphical Models for Vision-Language Compositional Understanding ☆9 Updated 3 months ago
- ☆24 Updated 7 months ago
- Source code repo for "AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving… ☆14 Updated 5 months ago
- ☆11 Updated 3 months ago
- ☆8 Updated 5 months ago
- SuperGS: Super-Resolution 3D Gaussian Splatting Enhanced by Variational Residual Features and Uncertainty-Augmented Learning ☆10 Updated last month
- ☆13 Updated 8 months ago
- ☆11 Updated 3 months ago
- unofficial ☆10 Updated 8 months ago
- Official Implementation of Towards Open Vocabulary Video Semantic Segmentation ☆10 Updated 4 months ago
- Score and Distribution Matching Policy: Advanced accelerated Visuomotor Policies via matched distillation ☆9 Updated 2 months ago
- ☆12 Updated 6 months ago
- the official implementation of GlobalMapNet ☆13 Updated 9 months ago
- KV cache compression via sparse coding ☆11 Updated 2 months ago
- Official implementation for P2SAM (ACM MM 2024) ☆12 Updated 7 months ago
- official repository for ATM-Traffic ☆10 Updated 3 months ago
- This repo contains the official code release of the Neural Experts paper, published in NeurIPS 2024. ☆10 Updated 7 months ago
- LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval ☆8 Updated 7 months ago
- ☆10 Updated 7 months ago
- ☆101 Updated 7 months ago
- [AAAI 2025] MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation ☆27 Updated last month
- SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context ☆5 Updated 6 months ago
- ☆22 Updated 2 weeks ago
- [CVPR 2025] DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation ☆52 Updated last month
- This repository contains the implementation of the paper: "ChatCam: Empowering Camera Control through Conversational AI", NeurIPS 2024. ☆17 Updated 8 months ago
- Code for the paper "ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions" published at CVPR 2025 ☆15 Updated 3 months ago