knightyxp/DGL

[AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.

☆49

Alternatives and similar repositories for DGL

Users who are interested in DGL are comparing it to the repositories listed below.

HuiGuanLab / ms-sl
Source code of our MM'22 paper Partially Relevant Video Retrieval
☆56 Nov 4, 2024 Updated last year
knightyxp / EVA_Video_Edit
EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing
☆30 Mar 29, 2024 Updated 2 years ago
Monoxide-Chen / uncertainty_retrieval
ICLR'24 Official Implementation of Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
☆74 Jan 30, 2024 Updated 2 years ago
iLearn-Lab / MM23-RTQ
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model
☆15 Apr 7, 2026 Updated 3 months ago
cnsdqd-dyb / Guide-GRPO
Aims for memory-efficient training (24GB VRAM) on consumer GPUs. Optimizing language models through guidance tokens in reasoning chains, …
☆28 Feb 23, 2025 Updated last year
pansanity666 / TransHuman
Official code for ICCV 2023 paper: "TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering".
☆67 Jan 11, 2024 Updated 2 years ago
CUHK-AIM-Group / MCPL
MCPL: Multi-modal Collaborative Prompt Learning for Medical Vision-Language Model (Initial Version)
☆13 Apr 17, 2024 Updated 2 years ago
kyegomez / MAGVIT2
Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION"
☆15 Nov 11, 2024 Updated last year
huangmozhi9527 / GMMFormer
[AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
☆21 May 10, 2024 Updated 2 years ago
hrtang22 / MUSE
Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"
☆26 Feb 2, 2025 Updated last year
ailab-kyunghee / CM2_DVC
[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval
☆66 Jun 19, 2024 Updated 2 years ago
OmkarThawakar / composed-video-retrieval
Composed Video Retrieval
☆62 May 2, 2024 Updated 2 years ago
hlchen23 / VERIFIED
Official repository of NeurIPS D&B Track 2024 paper "VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understan…
☆40 Jan 20, 2025 Updated last year
EasonXiao-888 / UVCOM
[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
☆117 Jul 17, 2024 Updated 2 years ago
LunarShen / DsicoVLA
[CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
☆22 Jun 23, 2025 Updated last year
huzhengdongcs / DAC-DETR
[NIPS2023] This is an official implementation of paper "DAC-DETR: Divide the Attention Layers and Conquer".
☆64 Jun 26, 2024 Updated 2 years ago
HuiGuanLab / DL-DKD
Source code of the paper Dual Learning with Dynamic Knowledge Distillation and Soft Alignment for Partially Relevant Video Retrieval
☆19 May 13, 2026 Updated 2 months ago
houzhijian / CONE
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
☆31 Aug 5, 2023 Updated 2 years ago
HCIS-Lab / Action-slot
[CVPR 2024] Action-slot: Visual Action-centric Representations for Atomic Activity Recognition in Traffic Scenes
☆25 Apr 28, 2025 Updated last year
ruc-aimc-lab / TeachCLIP
[CVPR 2024] TeachCLIP for Text-to-Video Retrieval
☆42 May 7, 2025 Updated last year
knightyxp / VideoGrain
[ICLR 2025] VideoGrain: This repo is the official implementation of "VideoGrain: Modulating Space-Time Attention for Multi-Grained Video …
☆159 Mar 24, 2025 Updated last year
foolwood / DRL
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval
☆96 Apr 7, 2022 Updated 4 years ago
refkxh / C-Instructor
[ECCV 2024] Official implementation of C-Instructor: Controllable Navigation Instruction Generation with Chain of Thought Prompting
☆31 Dec 16, 2024 Updated last year
JaydenZeng / EMMR
Mitigating Inconsistencies in Multimodal Sentiment Analysis under Uncertain Missing Modalities, in EMNLP2022
☆28 Jun 21, 2022 Updated 4 years ago
yuexihang / DeltaPhi
Implementation for "DeltaPhi: Learning Physical Trajectory Residual for PDE Solving"
☆13 Jun 17, 2024 Updated 2 years ago
gyxxyg / TRACE
[ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling
☆156 Aug 22, 2025 Updated 11 months ago
bladewaltz1 / PromptSwitch
☆30 Aug 14, 2023 Updated 2 years ago
Disguiser15 / RefTeacher
RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension.
☆14 May 26, 2023 Updated 3 years ago
mengcaopku / DCNet
[ACM MM 22] Correspondence Matters for Video Referring Expression Comprehension
☆15 Sep 4, 2022 Updated 3 years ago
ninatu / howtocaption
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024
☆59 Aug 19, 2025 Updated 11 months ago
lucas-ventura / CoVR
Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions".
☆119 Apr 21, 2026 Updated 3 months ago
musicman217 / Text-Proxy
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval -- AAAI2025
☆21 May 8, 2026 Updated 2 months ago
jinhyunj / EaTR
Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023)
☆55 Sep 7, 2023 Updated 2 years ago
sapeirone / EgoPack
Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2…
☆24 Jun 13, 2024 Updated 2 years ago
wangpengnorman / KB-Ref_dataset
☆16 Dec 28, 2020 Updated 5 years ago
dhg-wei / MCL
(ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
☆28 Sep 27, 2024 Updated last year
snumprlab / isr-dpo
Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)
☆23 Nov 25, 2025 Updated 8 months ago
Oliiveralien / MDMS
Multi-Domain Multi-Scale Diffusion Model for Low-Light Image Enhancement (AAAI'24)
☆45 Mar 1, 2025 Updated last year
zjuruizhechen / TVG-R1
[EMNLP 2025 Industry] Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning
☆36 Oct 22, 2025 Updated 9 months ago