LiWentomng / gradio-osprey-demo
Gradio demo used in our Osprey: Pixel Understanding with Visual Instruction Tuning.
☆15 Updated last year
Alternatives and similar repositories for gradio-osprey-demo
Users who are interested in gradio-osprey-demo are comparing it to the repositories listed below.
- ☆20 Updated 2 years ago
- ☆29 Updated last year
- Precision Search through Multi-Style Inputs ☆72 Updated 2 weeks ago
- Image Editing Anything ☆116 Updated 2 years ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo… ☆29 Updated 10 months ago
- Detectron2 is a platform for object detection, segmentation and other visual recognition tasks. ☆19 Updated 3 years ago
- ☆70 Updated last year
- Codebase for the Recognize Anything Model (RAM) ☆82 Updated last year
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model ☆98 Updated last year
- minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora ☆40 Updated last year
- EraseAnything, ICML 2025 ☆24 Updated 2 months ago
- ☆58 Updated 2 years ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆66 Updated 11 months ago
- ☆93 Updated last year
- Official PyTorch implementation for TCSVT 23 "Detect Any Shadow: Segment Anything for Video Shadow Detection" ☆61 Updated 8 months ago
- [NeurIPS2022] This is the official implementation of the paper "Expediting Large-Scale Vision Transformer for Dense Prediction without Fi… ☆85 Updated last year
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines ☆125 Updated 9 months ago
- [CVPR Challenge Rank 2nd] The codes and related files to reproduce the results for Video Similarity Challenge Descriptor Track. ☆19 Updated 4 months ago
- ☆46 Updated 7 months ago
- This repository is for the first survey on SAM & SAM2 for Videos. ☆52 Updated 3 months ago
- ☆189 Updated 2 months ago
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed ☆96 Updated 9 months ago
- DiverGen (CVPR 2024) & BSGAL (ICML 2024) ☆49 Updated last month
- Simple script to parallelize download and extract files for SA-1B Dataset. ☆37 Updated last month
- [NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT ☆136 Updated last year
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception ☆70 Updated 2 months ago
- [ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets ☆45 Updated last week
- [IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation ☆125 Updated 10 months ago
- VimTS: A Unified Video and Image Text Spotter ☆77 Updated 9 months ago
- Baby-DALL3: Annotation anything in visual tasks and Generate anything just all in one-pipeline with GPT-4 (a small baby of DALL·E 3). ☆83 Updated last year