mightyzau/RegionBLIP

☆59

Alternatives and similar repositories for RegionBLIP

Users that are interested in RegionBLIP are comparing it to the libraries listed below.

PVIT-official / PVIT
Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models
☆37 · Sep 19, 2023 · Updated 2 years ago
mightyzau / InfMLLM
☆19 · Dec 6, 2023 · Updated 2 years ago
jshilong / GPT4RoI
(ECCVW 2025) GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
☆556 · Jun 3, 2025 · Updated last year
icoz69 / StableLLAVA
Official repo for StableLLAVA
☆94 · Dec 22, 2023 · Updated 2 years ago
jamessealesmith / ConStruct-VL
PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning"
☆13 · Feb 5, 2024 · Updated 2 years ago
xing0047 / rewrite
[NeurIPS 2023] Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation
☆21 · Jan 3, 2024 · Updated 2 years ago
Qinying-Liu / TagAlign
Official implementation of TagAlign
☆37 · Dec 11, 2024 · Updated last year
SY-Xuan / Pink
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
☆100 · Jan 16, 2025 · Updated last year
showlab / macosworld
☆35 · Jan 28, 2026 · Updated 5 months ago
YuchenLiu98 / COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
☆211 · Jan 8, 2025 · Updated last year
emu1729 / GIST
Generating Image Specific Text
☆29 · Aug 14, 2023 · Updated 2 years ago
ml-jku / semantic-image-text-alignment
☆25 · Jul 10, 2023 · Updated 3 years ago
zxytim / pdf2images
Convert pdf to pages of images
☆13 · Apr 18, 2020 · Updated 6 years ago
sail-sg / ptp
[CVPR2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
☆150 · Jun 7, 2023 · Updated 3 years ago
thunlp / CPT
Colorful Prompt Tuning for Pre-trained Vision-Language Models
☆49 · Nov 1, 2022 · Updated 3 years ago
linhuixiao / CLIP-VG
[TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding.
☆135 · Nov 10, 2025 · Updated 8 months ago
OpenGVLab / all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of …
☆508 · Aug 9, 2024 · Updated last year
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆297 · Mar 13, 2024 · Updated 2 years ago
xiangyu-mm / EasyGen
The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"
☆73 · Nov 21, 2024 · Updated last year
Leon1207 / 3DRefTR
This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"
☆26 · Aug 24, 2023 · Updated 2 years ago
ForrestPi / ObjectDetection
some object detection algo
☆14 · Jul 25, 2024 · Updated last year
YeLuoSuiYou / openstorypp
We introduce OpenStory++, a large-scale open-domain dataset focusing on enabling MLLMs to perform storytelling generation tasks.
☆18 · Aug 30, 2024 · Updated last year
fjhzhixi / 3D-SPS
☆64 · May 17, 2023 · Updated 3 years ago
yuhangzang / ContextDET
Contextual Object Detection with Multimodal Large Language Models
☆261 · Oct 14, 2024 · Updated last year
LukeForeverYoung / QRNet
☆41 · Jun 3, 2022 · Updated 4 years ago
katieluo88 / DRIFT
☆16 · Jun 5, 2024 · Updated 2 years ago
shikras / shikra
☆814 · Jul 8, 2024 · Updated 2 years ago
RUCAIBox / ComVint
The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…
☆19 · Nov 10, 2023 · Updated 2 years ago
OpenGVLab / LAMM
[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
☆317 · Apr 16, 2024 · Updated 2 years ago
shikras / d-cube
A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating…
☆138 · Mar 20, 2024 · Updated 2 years ago
ver0z / Deformable-DETR-
Deformable DETR: Deformable Transformers for End-to-End Object Detection. This is an alternative for running custom datasets on Deformabl…
☆14 · Jan 24, 2022 · Updated 4 years ago
JiwanChung / vlis
☆24 · Oct 9, 2023 · Updated 2 years ago
BAAI-DCAI / DataOptim
A collection of visual instruction tuning datasets.
☆77 · Mar 14, 2024 · Updated 2 years ago
Chat-3D / Chat-3D
Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes"
☆57 · Mar 28, 2024 · Updated 2 years ago
mlfoundations / VisIT-Bench
☆51 · Oct 29, 2023 · Updated 2 years ago
allenai / reclip
☆92 · Apr 15, 2022 · Updated 4 years ago
yangyangyang127 / APE
[ICCV 2023] Code for "Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement"
☆150 · Apr 21, 2024 · Updated 2 years ago
sosppxo / 3D-STMN
[AAAI 2024] The official implementation of the paper "3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Refer…
☆46 · Dec 20, 2023 · Updated 2 years ago
Charles-Xie / awesome-described-object-detection
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring E…
☆358 · Nov 6, 2025 · Updated 8 months ago