jongwoopark7978/LVNet

jongwoopark7978 / LVNet

[Main Conference @ EACL'26] [Workshop @ NeurIPS'24] 🎞️ LVNet.

☆44

Alternatives and similar repositories for LVNet

Users that are interested in LVNet are comparing it to the libraries listed below.

kahnchana / LangToMo
[WIP] Code for LangToMo
☆21 · Mar 19, 2026 · Updated 4 months ago
kahnchana / mvu
🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU)
☆58 · Jan 31, 2025 · Updated last year
cfmata / CoPT
[ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings
☆10 · Feb 24, 2025 · Updated last year
LostXine / open_x_pytorch_dataloader
An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment
☆25 · Jan 9, 2025 · Updated last year
ADL-X / LLAVIDAL
This is the official repository of LLAVIDAL
☆25 · Oct 4, 2025 · Updated 9 months ago
Charlotte-CharMLab / Fibottention
Official Repository of "Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads"
☆17 · Oct 6, 2025 · Updated 9 months ago
MGitHubL / TMac
☆14 · Feb 26, 2024 · Updated 2 years ago
motional / motional-prediction-devkit
☆18 · Dec 17, 2022 · Updated 3 years ago
dhg-wei / TOPA
(NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment
☆29 · Sep 27, 2024 · Updated last year
TritonPaper / TRITON
☆14 · Jun 25, 2022 · Updated 4 years ago
kkahatapitiya / LangRepo
Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
☆36 · Jun 17, 2024 · Updated 2 years ago
RyannDaGreat / rp
This is a python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires python≥3.5
☆13 · Jul 13, 2026 · Updated last week
CeeZh / LLoVi
Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering"
☆106 · Oct 27, 2024 · Updated last year
kahnchana / clippy
Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23)
☆37 · Jan 1, 2024 · Updated 2 years ago
LostXine / crossway_diffusion
[ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning
☆72 · Aug 4, 2024 · Updated last year
elicassion / active-gym
Environments for Active Vision Reinforcement Learning
☆30 · Oct 10, 2024 · Updated last year
WHB139426 / GCG
Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24]
☆10 · Jul 22, 2024 · Updated last year
SalesforceAIResearch / FOFPred
☆39 · Jun 2, 2026 · Updated last month
FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆22 · Jan 11, 2026 · Updated 6 months ago
zhiyuanhubj / Long_form_VideoQA
[EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering
☆18 · Oct 9, 2024 · Updated last year
LostXine / LLaRA
[ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
☆229 · Mar 29, 2025 · Updated last year
qirui-chen / MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
☆38 · May 27, 2025 · Updated last year
Ziyang412 / VideoTree
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆165 · Jun 23, 2025 · Updated last year
srijandas07 / VPN
Pose driven attention mechanism
☆44 · Mar 31, 2022 · Updated 4 years ago
MAC-AutoML / QuoTA
✨✨[AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi…
☆79 · Apr 28, 2025 · Updated last year
wxh1996 / VideoAgent
☆150 · Apr 16, 2025 · Updated last year
bigai-nlco / VideoTGB
[EMNLP 2024] A Video Chat Agent with Temporal Prior
☆33 · Mar 2, 2025 · Updated last year
cankocagil / TT-SRN
TT-SRN: Twin Transformers with Sinusoidal Representation Networks for Video Instance Segmentation
☆16 · Oct 8, 2021 · Updated 4 years ago
yl3800 / TranSTR
☆12 · Dec 15, 2023 · Updated 2 years ago
elicassion / sugarl
Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability"
☆56 · Oct 10, 2024 · Updated last year
gls0425 / LinVT
LinVT: Empower Your Image-level Large Language Model to Understand Videos
☆83 · Dec 30, 2024 · Updated last year
doc-doc / NExT-GQA
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
☆89 · Jul 1, 2024 · Updated 2 years ago
sangminwoo / AvisC
[ACL 2025 Findings] Official pytorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis…
☆25 · Jul 21, 2024 · Updated last year
wangjiarui153 / AIGV-Assessor
[CVPR 2025] AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM
☆18 · Mar 19, 2026 · Updated 4 months ago
yiming-j / SPLINE-Net
SPLINE-Net: Sparse Photometric Stereo through Lighting Interpolation and Normal Estimation Networks
☆11 · Apr 13, 2023 · Updated 3 years ago
zhengrongz / AoTD
[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".
☆58 · May 25, 2025 · Updated last year
kahnchana / svt
Official repository for "Self-Supervised Video Transformer" (CVPR'22)
☆109 · Jun 26, 2024 · Updated 2 years ago
jiefeng0109 / RLSBS
Deep Reinforcement Learning for Semisupervised Hyperspectral Band Selection
☆11 · Jun 30, 2024 · Updated 2 years ago
sangminwoo / awesome-token-redundancy-reduction
😎 Awesome papers on token redundancy reduction
☆14 · Mar 12, 2025 · Updated last year