xiaoqian-shen/Vgent

[NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent

☆48

Alternatives and similar repositories for Vgent

Users that are interested in Vgent are comparing it to the libraries listed below.

maifoundations / Streamo
Streaming Video Instruction Tuning
☆77 · Feb 25, 2026 · Updated 4 months ago
CASIA-IVA-Lab / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆57 · Mar 9, 2025 · Updated last year
liunian-Jay / GainRAG
[ACL 2025] GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis
☆36 · Aug 10, 2025 · Updated 10 months ago
SoruxBot / SoruxBot
A cross-chat-platform bot framework that supports multi-language development.
☆15 · Jan 25, 2023 · Updated 3 years ago
ncTimTang / AKS
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
☆226 · Dec 19, 2025 · Updated 6 months ago
xuyang-liu16 / V2Drop
[CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
☆32 · May 27, 2026 · Updated last month
nianfd / RWKV-VG
☆10 · Dec 3, 2024 · Updated last year
Yubo-Shankui / Bind-Your-Avatar-Implementation
(CVPR 26 Findings) Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-…
☆34 · Apr 7, 2026 · Updated 3 months ago
zhuyjan / MER2025-MRAC25
[ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment.
☆25 · Nov 25, 2025 · Updated 7 months ago
HanNight / AdaCAD
Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"
☆16 · Mar 2, 2026 · Updated 4 months ago
weijielyu / FaceCam
[CVPR 2026] FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
☆62 · Mar 26, 2026 · Updated 3 months ago
aimagelab / ReflectiVA
[CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
☆56 · Jul 14, 2025 · Updated 11 months ago
GasolSun36 / SURf
[EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information
☆11 · Oct 11, 2024 · Updated last year
bscho333 / ReVisiT
[ACL 2026 Main] Revisit What You See: Revealing Visual Semantics in Vision Tokens to Guide LVLM Decoding
☆26 · Nov 21, 2025 · Updated 7 months ago
Liu233w / CapsLocKeys.ahk
An AutoHotkey script that makes your Caps Lock key more powerful
☆14 · Aug 3, 2018 · Updated 7 years ago
G-JWLee / TAMP
☆12 · May 15, 2025 · Updated last year
onecun / Eeader
An EPUB e-book reader with local parsing and storage
☆10 · Jul 11, 2023 · Updated 2 years ago
sail-sg / Video-Next-Event-Prediction
☆28 · Aug 9, 2025 · Updated 11 months ago
lchen1019 / ERR-Seg
[Neurocomputing] Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation
☆25 · Dec 21, 2025 · Updated 6 months ago
CnFaker / LLaVA-SP
[ICCV 2025] The official pytorch implement of "LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs".
☆24 · Oct 28, 2025 · Updated 8 months ago
gszfwsb / AutoGnothi
Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"
☆23 · Mar 4, 2025 · Updated last year
zaiquanyang / LLaVA_Next_STVG
LLaVA-Next for STVG
☆21 · Dec 5, 2025 · Updated 7 months ago
amandpkr / Efficient-3D-Aware-Facial-Image-Editing
[ECCV 2024] Official code repository of paper titled "Efficient 3D-Aware Facial Image Editing Via Attribute-Specific Prompt Learning"
☆10 · Aug 2, 2024 · Updated last year
Vision-CAIR / Infinibench
Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows
☆20 · Nov 4, 2025 · Updated 8 months ago
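zrf-rocket / PracticeSpider
Web crawling techniques using scrapy, pyspider, appium, beautiful soup, selenium, uiautomator2, and others. Crawlers for vulnerability information, threat intelligence, public opinion analysis, self-media platform information, e-commerce product information, and more.
☆10 · Oct 20, 2023 · Updated 2 years ago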
hhnqqq / py_hfd
A python script for downloading huggingface datasets and models.
☆20 · Apr 10, 2025 · Updated last year
zhujinliang / chinesetokenization
chinesetokenization
☆13 · Jun 4, 2013 · Updated 13 years ago
leeruibin / MfM
[ICLR 2026] Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks
☆31 · Feb 5, 2026 · Updated 5 months ago
z-lab / flash-colreduce
Fast, memory-efficient attention column reduction (e.g., sum, mean, max)
☆49 · Feb 10, 2026 · Updated 4 months ago
swarupbehera / awesome-audio-visual-question-answering
A curated list of resources in audio visual question answering and related area. :-)
☆17 · Jun 29, 2025 · Updated last year
Lou1sM / meaningful_image_complexity
☆17 · Mar 24, 2025 · Updated last year
MrZilinXiao / ProxyThinker
[ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.
☆22 · Sep 24, 2025 · Updated 9 months ago
AVoCaDO-Captioner / AVoCaDO
https://avocado-captioner.github.io/
☆37 · Oct 16, 2025 · Updated 8 months ago
daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18 · Aug 27, 2025 · Updated 10 months ago
kodenii / ORES
ORES: Open-vocabulary Responsible Visual Synthesis
☆14 · Dec 12, 2023 · Updated 2 years ago
Tianhao-Qi / Mask2DiT
CVPR 2025 Accepted Papers
☆26 · Dec 20, 2025 · Updated 6 months ago
yangluo7 / V-ReasonBench
☆36 · Feb 18, 2026 · Updated 4 months ago
GXYM / VCapsBench
VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation
☆20 · Jun 2, 2025 · Updated last year
CeeZh / SILVR
Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 5 months ago