sharad5 / OWL-ViT-Object-Detection
Capstone Project: Training and Fine-tuning OWL-ViT for the Referring Expression Task
☆12Updated 2 years ago
Alternatives and similar repositories for OWL-ViT-Object-Detection
Users who are interested in OWL-ViT-Object-Detection are comparing it to the repositories listed below
- ☆54Updated 2 years ago
- Auto Segmentation label generation with SAM (Segment Anything) + Grounding DINO☆22Updated 11 months ago
- ☆18Updated last year
- Detectron2 Toolbox and Benchmark for V3Det☆18Updated last year
- GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)☆72Updated 2 years ago
- Official implementation of "Confidence-Calibrated Face and Kinship Verification" (T-IFS 2023)☆24Updated 2 years ago
- Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.☆20Updated 3 years ago
- ☆10Updated 2 years ago
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding☆63Updated last year
- LP-OVOD: Open-Vocabulary Object Detection by Linear Probing (WACV 2024)☆29Updated last year
- ☆13Updated 2 years ago
- object detection based on owl-vit☆67Updated 2 years ago
- Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral)☆34Updated 10 months ago
- In this work, we implement different cross-modal learning schemes such as Siamese Network, Correlational Network and Deep Cross-Modal Pro…☆11Updated 4 years ago
- [ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting☆122Updated last year
- ☆20Updated 6 months ago
- Using image captions with LLM for zero-shot VQA☆18Updated last year
- Official PyTorch implementation of the paper "Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner"☆15Updated 2 years ago
- Fine-tuning CLIP for Few-Shot Learning☆46Updated 4 years ago
- Paper list of compositional zero-shot learning☆11Updated 3 years ago
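- [AIGC Hands-On Beginner Notes: AIGC Skyscraper] Sharing large language models (LLMs), efficient fine-tuning of large models (SFT), retrieval-augmented generation (RAG), agents, automatic PPT generation, role-playing, text-to-image (Stable Diffusion), optical character recognition (OCR), automatic speech recognition (ASR), …☆54Updated 9 months ago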
- Build a simple, basic multimodal large model from scratch 🤖☆47Updated last year
- [CSCWD] Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes the Lead.☆129Updated 10 months ago
- Faysal-MD / Unmasking-Deepfake-Faces-from-Videos-An-Explainable-Cost-Sensitive-Deep-Learning-Approach-IEEE2023: Deepfake face detection from forged videos, using explainable AI for model robustness as well as cost-sensitive methods for mitiga…☆10Updated last year
- Code for paper: Unified Text-to-Image Generation and Retrieval☆16Updated last year
- The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".☆253Updated last year
- Vision-oriented multimodal AI☆49Updated last year
- A project using YOLOv8 to detect license plates☆12Updated 2 years ago
- Faceprecision is a comprehensive face analysis project leveraging advanced deep learning and computer vision techniques. This project inc…☆14Updated last year
- Scripts, data and research related to cow weight and breed prediction☆13Updated 5 months ago