Aasthaengg / GLIP-BLIP-Vision-Langauge-Obj-Det-VQALinks

☆33

Alternatives and similar repositories for GLIP-BLIP-Vision-Langauge-Obj-Det-VQA

Users that are interested in GLIP-BLIP-Vision-Langauge-Obj-Det-VQA are comparing it to the libraries listed below

Sorting:

roboflow / cvevals
Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model…
☆36Updated last year
alimirzazadeh / SemisupervisedAttention
☆13Updated 2 years ago
purnasai / CLIP_Image_Retrieval
Image/Instance Retrieval using CLIP, A self supervised Learning Model
☆28Updated 2 years ago
autodistill / autodistill-grounded-edgesam
EdgeSAM model for use with Autodistill.
☆27Updated last year
apple / ml-mofi
☆58Updated last year
basilevh / tcow
Tracking through Containers and Occluders in the Wild (CVPR 2023) - Official Implementation
☆41Updated last year
allenai / grit_official
Official repository for the General Robust Image Task (GRIT) Benchmark
☆54Updated 2 years ago
kaiyuyue / nxtp
PyTorch Implementation of Object Recognition as Next Token Prediction [CVPR 2024 Highlight]
☆180Updated last month
visual-layer / visuallayer
Simplify Your Visual Data Ops. Find and visualize issues with your computer vision datasets such as duplicates, anomalies, data leakage, …
☆70Updated last month
facebookresearch / PartDistillation
Code release for the CVPR'23 paper titled "PartDistillation Learning part from Instance Segmentation"
☆58Updated last year
kyegomez / PALI
Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"
☆91Updated last year
iKrishneel / detectron2_timm
A simple wrapper library for binding timm models as detectron2 backbones
☆43Updated 2 years ago
autodistill / autodistill-sam-clip
SAM-CLIP module for use with Autodistill.
☆15Updated last year
NVlabs / STL
Official Pytorch Implementation of Self-emerging Token Labeling
☆33Updated last year
kirill-vish / Beyond-INet
Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"
☆101Updated 9 months ago
allenai / gpv-1
A task-agnostic vision-language architecture as a step towards General Purpose Vision
☆92Updated 3 years ago
jozhang97 / DETA
Detection Transformers with Assignment
☆255Updated last year
lucidrains / MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
☆103Updated last year
NVlabs / RIO
Official PyTorch implementation of RIO
☆18Updated 3 years ago
jacobmarks / zero-shot-prediction-plugin
Run zero-shot prediction models on your data
☆32Updated 6 months ago
LHBuilder / SA-Segment-Anything
Vision-oriented multimodal AI
☆49Updated last year
deep-diver / LoRA-deployment
LoRA fine-tuned Stable Diffusion Deployment
☆31Updated 2 years ago
elsevierlabs-os / clip-image-search
Fine-tuning OpenAI CLIP Model for Image Search on medical images
☆76Updated 3 years ago
kyegomez / PALI3
Implementation of PALI3 from the paper PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"
☆145Updated 2 months ago
rohitrango / beyond-map
[CVPR 2023 Highlight] Beyond mAP: Towards better evaluation of instance segmentation
☆27Updated 2 years ago
roboflow / rf100-vl
Code from the paper "Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models"
☆67Updated 3 weeks ago
MAEHCM / AET
Code for AAAI 2023 Paper : “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”
☆18Updated 2 years ago
LAION-AI / General-GPT
☆64Updated last year
GewelsJI / MVLT
Masked Vision-Language Transformer in Fashion
☆33Updated last year
gregor-ge / mBLIP
☆87Updated last year