yeungchenwa / OCR-SAM
Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting
☆517 Updated 7 months ago
Related projects:
- [ECCV 2024] Tokenize Anything via Prompting ☆502 Updated 2 months ago
- This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP. ☆676 Updated 11 months ago
- Simple static web-based mask drawer, supporting semantic segmentation and video segmentation with interactive Segment Anything Model 2 (S… ☆352 Updated last month
- On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench) ☆435 Updated 3 weeks ago
- Segment Anything combined with CLIP ☆328 Updated 7 months ago
- Grounded Segment Anything: From Objects to Parts ☆383 Updated last year
- [arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation ☆188 Updated 2 months ago
- The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++:… ☆240 Updated last month
- Segment-anything related awesome extensions/projects/repos. ☆340 Updated last year
- Experiment on combining CLIP with SAM to do open-vocabulary image segmentation. ☆331 Updated last year
- [ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection" ☆637 Updated 7 months ago
- Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexib… ☆601 Updated 10 months ago
- [CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language ☆1,281 Updated 11 months ago
- [ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity" ☆2,255 Updated 2 months ago
- Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, detection ... in specific scena… ☆754 Updated last year
- A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text remova… ☆186 Updated last month
- [Image and Vision Computing (Vol.147 Jul. '24)] Interactive Natural Image Matting with Segment Anything Models ☆468 Updated 3 months ago
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning" ☆364 Updated 5 months ago
- This is an implementation of zero-shot instance segmentation using Segment Anything. ☆295 Updated last year
- [CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want ☆640 Updated last month
- [A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet. ☆785 Updated last year
- Combining Segment Anything (SAM) with Grounded DINO for zero-shot object detection and CLIPSeg for zero-shot segmentation ☆365 Updated 4 months ago
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2 ☆710 Updated 2 weeks ago
- Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image. ☆439 Updated last year
- [CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks ☆291 Updated 3 weeks ago
- API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ☆707 Updated last month
- [CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segme… ☆1,153 Updated 9 months ago
- [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha… ☆742 Updated 3 months ago
- A collection of project, papers, and source code for Meta AI's Segment Anything Model (SAM) and related studies. ☆314 Updated this week
- MetaSeg: Packaged version of the Segment Anything repository ☆945 Updated this week