yixuan730/DetToolChain

DetToolChain: A new prompting paradigm to unleash detection ability of MLLM

☆45

Alternatives and similar repositories for DetToolChain

Users that are interested in DetToolChain are comparing it to the libraries listed below.

THUNLP-MT / Scaffold
Scaffold Prompting to promote LMMs
☆46 · Dec 16, 2024 · Updated last year
ylingfeng / FGVP
Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023
☆57 · Feb 1, 2024 · Updated 2 years ago
Liqq1 / awesome-medical-vision-and-language-pretraining
The collection of medical VLP papers
☆20 · Jul 24, 2024 · Updated 2 years ago
mahtabbigverdi / Aurora
☆12 · Dec 4, 2024 · Updated last year
Visual-AI / v-CLR
[CVPR 2025 Highlight] v-CLR: View-Consistent Learning for Open-World Instance Segmentation
☆21 · May 31, 2026 · Updated last month
bzluan / TextCoT
[ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"
☆45 · Feb 27, 2026 · Updated 4 months ago
FreedomIntelligence / Med-MAT
[ACL 2025] Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging
☆40 · Jun 4, 2025 · Updated last year
UARK-AICV / FG-CXR
The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge…
☆12 · Jul 28, 2025 · Updated 11 months ago
chancharikmitra / CCoT
[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"
☆142 · Jun 20, 2024 · Updated 2 years ago
lzyhha / HSSL
Enhancing Representations through Heterogeneous Self-Supervised Learning (TPAMI 2025)
☆15 · May 2, 2025 · Updated last year
meetdavidwan / crg
PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"
☆39 · Mar 4, 2024 · Updated 2 years ago
zhonhel / Incremental-Object-Detection-with-Feature-Pyramid-Network-and-Knowledge-Distillation
Incremental Object Detection with Feature Pyramid Network (FPN) and Knowledge Distillation.
☆12 · Jan 16, 2025 · Updated last year
2toinf / IVM
[NeurIPS 2024] The official implementation of "Instruction-Guided Visual Masking"
☆42 · Nov 15, 2024 · Updated last year
ggg0919 / cantor
☆90 · May 10, 2024 · Updated 2 years ago
GuangyanS / Sys2-LLaVA
☆31 · Feb 10, 2025 · Updated last year
NK-JittorCV / nk-diffusion
☆18 · Jul 2, 2026 · Updated 3 weeks ago
wenhaochai / claude-plugins
Personal Claude Code plugin marketplace
☆16 · Updated this week
MCG-NKU / SERE
Exploring Feature Self-relation for Self-supervised Transformer (TPAMI 2023)
☆21 · Apr 30, 2025 · Updated last year
mrwu-mac / ControlMLLM
[NeurIPS 2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"
☆211 · Jul 17, 2025 · Updated last year
ASGMVLP / ASGMVLP_CODE
The repo of ASGMVLP
☆19 · Jan 16, 2026 · Updated 6 months ago
SooLab / DDCOT
[NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
☆48 · Mar 18, 2024 · Updated 2 years ago
rorubyy / thermal_rgb_fusion_yolov8
☆19 · Aug 7, 2023 · Updated 2 years ago
yu-rp / apiprompting
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
☆112 · Oct 10, 2024 · Updated last year
WillDreamer / Awesome-MLLM-Reasoning
Recent Advances on MLLM's Reasoning Ability
☆26 · Apr 11, 2025 · Updated last year
scofield7419 / Video-of-Thought
Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"
☆182 · Feb 25, 2025 · Updated last year
Kamichanw / MimIC
[CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks"
☆26 · May 21, 2026 · Updated 2 months ago
360CVGroup / LMM-Det
Make Large Multimodal Models excel in object detection, ICCV 2025
☆65 · Aug 1, 2025 · Updated 11 months ago
IntMeGroup / LMM4LMM
[ICCV 2025 Highlight] LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs
☆20 · Nov 16, 2025 · Updated 8 months ago
ZX-Yin / DreamLifting
The code implementation for the paper "DreamLifting: A Plug-in Module Lifting MV Diffusion Models for 3D Asset Generation".
☆30 · Sep 1, 2025 · Updated 10 months ago
GasolSun36 / MVP
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
☆24 · Sep 9, 2024 · Updated last year
NNNNerd / Triple-I-Net-TINet
Official code for "Illumination-guided RGBT Object Detection with Inter- and Intra-modality Fusion"
☆22 · Aug 13, 2024 · Updated last year
wannature / Detective-A-Dynamic-Integrated-Uncertainty-Valuation-Framework
PyTorch implementation of Detective
☆13 · Jul 11, 2024 · Updated 2 years ago
NastaranBa / ACE-for-Sarcasm-Detection
☆11 · Dec 1, 2020 · Updated 5 years ago
JingyuanZhou / Task_Adaptive_Network
☆11 · Nov 8, 2022 · Updated 3 years ago
zeng-ziyin / U-Next
U-Next
☆22 · Jan 4, 2024 · Updated 2 years ago
MLLMKCBENCH / MLLMKC
【AAAI 2026 🔥】A benchmark that evaluates multimodal knowledge conflicts for large multimodal models
☆25 · May 27, 2025 · Updated last year
Lackel / AGLA
[CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention
☆68 · Jul 16, 2024 · Updated 2 years ago
ritaranx / BMRetriever
[EMNLP 2024] This is the code for our paper "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers".
☆26 · Sep 19, 2024 · Updated last year
ChantalMP / RaDialog_v2
LLaVA Version of RaDialog
☆26 · May 27, 2025 · Updated last year