dvlab-research / LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

☆2,535

Alternatives and similar repositories for LISA

Users who are interested in LISA are comparing it to the repositories listed below.

ZrrSkywalker / Personalize-SAM
Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds
☆1,639 Updated last year
UX-Decoder / Semantic-SAM
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
☆2,787 Updated 5 months ago
jianzongwu / Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
☆973 Updated 9 months ago
FoundationVision / GLEE
[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
☆1,167 Updated last year
baaivision / EVA
EVA Series: Visual Representation Fantasies from BAAI
☆2,628 Updated last year
OpenGVLab / VisionLLM
VisionLLM Series
☆1,131 Updated 10 months ago
baaivision / Painter
Painter & SegGPT Series: Vision Foundation Models from BAAI
☆2,587 Updated last year
mbzuai-oryx / groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha…
☆935 Updated 4 months ago
CircleRadon / Osprey
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
☆837 Updated 4 months ago
SunzeY / AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
☆859 Updated 5 months ago
microsoft / GLIP
Grounded Language-Image Pre-training
☆2,559 Updated last year
tianrun-chen / SAM-Adapter-PyTorch
Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
☆1,398 Updated 3 weeks ago
ytongbai / LVM
☆1,838 Updated last year
fudan-zvg / Semantic-Segment-Anything
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
☆2,297 Updated 2 years ago
jingyi0000 / VLM_survey
Collection of AWESOME vision-language models for vision tasks
☆3,039 Updated 2 months ago
Computer-Vision-in-the-Wild / CVinW_Readings
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
☆1,350 Updated last year
HarborYuan / ovsam
[ECCV 2024] The official code of paper "Open-Vocabulary SAM".
☆1,027 Updated 4 months ago
UX-Decoder / Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
☆4,754 Updated last year
z-x-yang / Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary alg…
☆3,087 Updated last year
yformer / EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
☆2,456 Updated last year
OpenGVLab / InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
☆2,145 Updated last week
microsoft / X-Decoder
[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language
☆1,337 Updated 2 years ago
xinyu1205 / recognize-anything
Open-source and strong foundation image recognition models.
☆3,536 Updated 10 months ago
Hedlen / awesome-segment-anything
Tracking and collecting papers/projects/others related to Segment Anything.
☆1,678 Updated 9 months ago
OFA-Sys / ONE-PEACE
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model To…
☆1,062 Updated last year
baaivision / Emu
Emu Series: Generative Multimodal Models from BAAI
☆1,761 Updated last year
IDEA-Research / Grounding-DINO-1.5-API
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
☆1,072 Updated 11 months ago
shikras / shikra
☆800 Updated last year
BAAI-DCAI / Bunny
A family of lightweight multimodal models.
☆1,049 Updated last year
NVlabs / ODISE
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
☆930 Updated last year