LiWentomng / gradio-osprey-demo
Gradio demo used in our Osprey: Pixel Understanding with Visual Instruction Tuning.
☆15 · Updated last year
Alternatives and similar repositories for gradio-osprey-demo
Users who are interested in gradio-osprey-demo are comparing it to the repositories listed below.
- ☆20 · Updated last year
- Detectron2 is a platform for object detection, segmentation and other visual recognition tasks. ☆19 · Updated 3 years ago
- ☆58 · Updated last year
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo… ☆29 · Updated 9 months ago
- A Simple Framework of Small-scale LMMs for Video Understanding ☆65 · Updated 2 weeks ago
- ☆27 · Updated 7 months ago
- minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora ☆40 · Updated last year
- ☆19 · Updated 2 years ago
- Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing ☆33 · Updated this week
- ☆29 · Updated 11 months ago
- DiverGen (CVPR 2024) & BSGAL (ICML 2024) ☆46 · Updated 3 months ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆66 · Updated 9 months ago
- ☆45 · Updated 6 months ago
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model ☆98 · Updated 11 months ago
- ☆34 · Updated last year
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model ☆42 · Updated 10 months ago
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution ☆51 · Updated 3 months ago
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories ☆54 · Updated 3 months ago
- OpenMMLab Detection Toolbox and Benchmark for V3Det ☆15 · Updated last year
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt… ☆30 · Updated 8 months ago
- ☆32 · Updated last year
- Official Implementation of ICCV 2023 Paper - SegPrompt: Boosting Open-World Segmentation via Category-level Prompt Learning ☆110 · Updated last month
- ☆62 · Updated last month
- Precision Search through Multi-Style Inputs ☆70 · Updated 2 months ago
- Official code for paper "GRIT: Teaching MLLMs to Think with Images" ☆98 · Updated this week
- [TMLR] Official PyTorch implementation of "λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent… ☆51 · Updated 6 months ago
- ☆37 · Updated last month
- Concept Lancet: Image Editing with Compositional Representation Transplant (CVPR 2025) ☆15 · Updated 3 months ago
- ☆40 · Updated 5 months ago
- 🏞️ Official implementation of "Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition" ☆107 · Updated last year