[TMM 2025] This is the official Pytorch code for our paper "Visual Position Prompt for MLLM based Visual Grounding".
☆29Jul 23, 2025Updated 8 months ago
Alternatives and similar repositories for VPP-LLaVA
Users that are interested in VPP-LLaVA are comparing it to the libraries listed below.
- This is the official Pytorch code for our paper "Artemis: Structured Visual Reasoning for Perception Policy Learning".☆14Dec 4, 2025Updated 3 months ago
- 16k Hz Vocoder (HiFiGAN Codes and Pretrained Models)☆18Apr 3, 2023Updated 2 years ago
- [TIP] Exploring Effective Factors for Improving Visual In-Context Learning☆20Jul 2, 2025Updated 8 months ago
- ☆10Jan 28, 2024Updated 2 years ago
- ACM MM 2022 paper_AVQA: A Dataset for Audio-Visual Question Answering on Videos☆16Aug 17, 2023Updated 2 years ago
- The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…☆10Feb 9, 2025Updated last year
- Unofficial version of LaneExtraction☆13Oct 12, 2022Updated 3 years ago
- Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS"☆23Mar 6, 2023Updated 3 years ago
- [ACL 2021] This is the Pytorch code for our paper "Semantic Relation-aware Difference Representation Learning for Change Captioning".☆13Jan 16, 2022Updated 4 years ago
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization☆24Mar 6, 2026Updated 2 weeks ago
- RPIfield dataset for Person Re-identification☆13Aug 17, 2020Updated 5 years ago
- Official PyTorch Implementation of Exploring Stochastic Autoregressive Image Modeling for Visual Representation, Accepted by AAAI 2023.☆16Jul 3, 2023Updated 2 years ago
- ☆13May 21, 2023Updated 2 years ago
- Reproducing the results of https://arxiv.org/abs/1712.05790☆19Nov 21, 2021Updated 4 years ago
- ☆14Oct 11, 2023Updated 2 years ago
- [CVPR2025] Code Release of Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception☆22Jun 17, 2025Updated 9 months ago
- Code for EMNLP 2022 main conference paper "Low-resource Neural Machine Translation with Cross-modal Alignment".☆15Apr 25, 2023Updated 2 years ago
- this repo contains some useful metadata for Fashion IQ challenge: https://sites.google.com/view/lingir/fashion-iq☆15Jun 28, 2019Updated 6 years ago
- Official code for Guiding Language Model Math Reasoning with Planning Tokens☆19Feb 29, 2024Updated 2 years ago
- implementation of TDConvED for video captioning☆13Mar 18, 2020Updated 6 years ago
- Code for the EACL 2024 paper: "Small Language Models Improve Giants by Rewriting Their Outputs"☆12Apr 20, 2024Updated last year
- Reinforcement Learning attempts to beat Contra 3 for the SNES☆14Feb 16, 2019Updated 7 years ago
- Logs in to the Zhengfang (正方) academic affairs system and scrapes grades☆12May 10, 2015Updated 10 years ago
- ☆16Oct 11, 2025Updated 5 months ago
- Frequently updated list of dLLM (Diffusion Large Language Models) papers, models, and other resources☆24Jan 30, 2026Updated last month
- Code for ECCV 2020 paper "Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language"☆17Aug 25, 2020Updated 5 years ago
- Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*☆15Apr 6, 2021Updated 4 years ago
- [IEEE TMM 2023] This is the Pytorch code for our paper "Neighborhood Contrastive Transformer for Change Captioning".☆12Aug 30, 2023Updated 2 years ago
- Re-implementation of Progressive Neural Networks with PyTorch☆15Jul 25, 2024Updated last year
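- Homework for the Fall 2020 Pattern Recognition course at the University of Chinese Academy of Sciences (instructors: 刘成林, 向世明, 张煦尧)☆10Feb 3, 2021Updated 5 years ago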
- Make wonderful CVs with web technologies.☆14Nov 5, 2021Updated 4 years ago
- [AAAI 2026] Relation-R1: Progressively Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relation Comprehension☆18Mar 6, 2026Updated 2 weeks ago
- [ICLR 2025] This repo is the official implementation of our paper "Learning Fine-Grained Representations through Textual Token Disentangl…☆23Jul 28, 2025Updated 7 months ago
- Website for the Simplified Chinese localization patches of the Higurashi When They Cry (《寒蝉鸣泣之时》) series☆16Jan 6, 2026Updated 2 months ago
- End-to-end implementation of the Social Graph Network (SGN), described in the Structural Reasoning for Image-based Social Relation Recogn…☆13Apr 3, 2024Updated last year
- ☆14Mar 11, 2024Updated 2 years ago
- ☆16Nov 29, 2014Updated 11 years ago
- Attention Unet and Deep Unet implementation for road extraction multi-gpu tensorflow☆18Feb 22, 2021Updated 5 years ago
- Exploring CoT-Decoding from Google DeepMind's paper, "Chain-of-Thought Reasoning Without Prompting".☆13Feb 22, 2024Updated 2 years ago