xiaoqian-shen / Vgent
[NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent
☆39 Nov 30, 2025 Updated 2 months ago
Alternatives and similar repositories for Vgent
Users who are interested in Vgent are comparing it to the repositories listed below.
- Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Rou… ☆34 Sep 25, 2025 Updated 4 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs ☆54 Mar 9, 2025 Updated 11 months ago
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding ☆177 Dec 19, 2025 Updated last month
- ☆82 Jan 18, 2026 Updated 3 weeks ago
- ☆37 Sep 16, 2024 Updated last year
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models ☆75 May 31, 2025 Updated 8 months ago
- (ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ☆74 Feb 9, 2026 Updated last week
- Math24o: High School Olympiad Mathematics Chinese Benchmark ☆11 Mar 27, 2025 Updated 10 months ago
- ☆18 Feb 16, 2025 Updated last year
- ☆11 Oct 31, 2024 Updated last year
- ☆22 Dec 11, 2025 Updated 2 months ago
- ☆13 Jul 3, 2024 Updated last year
- ☆15 Sep 11, 2025 Updated 5 months ago
- ☆21 Feb 3, 2026 Updated 2 weeks ago
- ☆13 May 15, 2025 Updated 9 months ago
- CVPR 2025 Accepted Papers ☆23 Dec 20, 2025 Updated last month
- [CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering ☆54 Jul 14, 2025 Updated 7 months ago
- ☆20 Nov 21, 2025 Updated 2 months ago
- ☆21 Jun 16, 2025 Updated 8 months ago
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners. ☆20 Sep 24, 2025 Updated 4 months ago
- https://avocado-captioner.github.io/ ☆28 Oct 16, 2025 Updated 4 months ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆23 Sep 21, 2025 Updated 4 months ago
- [ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment. ☆25 Nov 25, 2025 Updated 2 months ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 Oct 11, 2024 Updated last year
- Long Context Research ☆26 Jan 26, 2026 Updated 3 weeks ago
- Official PyTorch implementation of the paper "FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing" ☆76 Dec 12, 2025 Updated 2 months ago
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection ☆134 Jul 28, 2025 Updated 6 months ago
- ☆13 Apr 2, 2024 Updated last year
- ☆18 Jun 14, 2025 Updated 8 months ago
- A local search system implementation using Elasticsearch for Wikipedia data indexing and retrieval. ☆12 May 17, 2025 Updated 9 months ago
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023 ☆16 Nov 28, 2024 Updated last year
- ☆12 Jul 13, 2025 Updated 7 months ago
- Code for the "Long Context Needs Some R&R" paper. ☆12 Mar 11, 2024 Updated last year
- Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Finding]" ☆15 Aug 27, 2025 Updated 5 months ago
- A WeChat auto-reply bot implemented in Python ☆11 Nov 16, 2019 Updated 6 years ago
- [Neurocomputing] Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation ☆22 Dec 21, 2025 Updated last month
- v1: Learning to Point Visual Tokens for Multimodal Grounded Reasoning ☆18 Oct 6, 2025 Updated 4 months ago
- ☆63 Feb 4, 2026 Updated last week
- ☆31 Dec 4, 2025 Updated 2 months ago