Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.
☆134 · Aug 7, 2024 · Updated last year
Alternatives and similar repositories for autodistill-grounded-sam-2
Users who are interested in autodistill-grounded-sam-2 are comparing it to the libraries listed below.
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p… ☆14 · Feb 5, 2024 · Updated 2 years ago
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2 ☆3,306 · Nov 11, 2025 · Updated 3 months ago
- Chat with an AI simulation of anyone as easily as copy-pasting text into a folder! ☆19 · Mar 4, 2023 · Updated 3 years ago
- GroundedSAM Base Model plugin for Autodistill ☆55 · Apr 17, 2024 · Updated last year
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model… ☆37 · Oct 18, 2023 · Updated 2 years ago
- ☆14 · Aug 10, 2025 · Updated 6 months ago
- A simple demo for utilizing Grounding DINO and Segment Anything v2 models together ☆21 · Jul 31, 2024 · Updated last year
- Aerosol Optical Depth Statistical Analysis ☆11 · Jun 1, 2016 · Updated 9 years ago
- The sparse Bayesian learning sandbox ☆11 · Jul 4, 2021 · Updated 4 years ago
- A Docker-based environment for testing the DWPP Nav2 plugin ☆23 · Jan 26, 2026 · Updated last month
- Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that utilizes… ☆10 · Feb 20, 2025 · Updated last year
- Examples in the MLX framework ☆11 · Sep 23, 2024 · Updated last year
- ODLabel is a powerful tool for zero-shot object detection, labeling and visualization. It provides an intuitive graphical user interface … ☆10 · May 19, 2024 · Updated last year
- ☆33 · Sep 25, 2025 · Updated 5 months ago
- Unofficial implementation for SOLO instance segmentation ☆25 · Mar 29, 2020 · Updated 5 years ago
- A Desktop Application to showcase primary OpenCV functions. With OpenCV Catalogue one can create a chain of various available OpenCV function… ☆11 · May 10, 2024 · Updated last year
- Unofficial implementation of Semantic-aware Guidance (S-CFG) for ComfyUI ☆12 · Aug 8, 2024 · Updated last year
- Mahalanobis Distance-based Multi-view Optimal Transport for Multi-view Crowd Localization, ECCV 2024 ☆15 · Nov 20, 2024 · Updated last year
- ☆29 · Jul 6, 2022 · Updated 3 years ago
- Real-time object detection using Florence-2 with a user-friendly GUI. ☆31 · Aug 7, 2025 · Updated 7 months ago
- ☆31 · Dec 20, 2022 · Updated 3 years ago
- [MM 2024 Oral] Refiner for AIGC ☆29 · Jul 29, 2024 · Updated last year
- A Gradio web UI for Depth-Pro, Sharp Monocular Metric Depth Estimation ☆55 · Oct 7, 2024 · Updated last year
- Unreal Engine 5 based LiDAR simulation plugin ☆65 · Jan 25, 2024 · Updated 2 years ago
- ☆12 · Jan 25, 2023 · Updated 3 years ago
- Tools for machine learning of animal behavior ☆15 · Mar 8, 2024 · Updated 2 years ago
- A deep learning-powered visual navigation engine that enables autonomous navigation of a pocket-size quadrotor, running on PULP ☆13 · Oct 30, 2024 · Updated last year
- This ComfyUI node pack allows the user to take a panoramic photo and a corresponding depth map, and turn it into a 3D environment that ca… ☆13 · Mar 29, 2025 · Updated 11 months ago
- Official code for "EgoVSR: Towards High-Quality Egocentric Video Super-Resolution" ☆15 · Jul 26, 2023 · Updated 2 years ago
- Compare Savant and PyTorch performance ☆13 · Feb 9, 2024 · Updated 2 years ago
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ☆1,086 · Jan 21, 2025 · Updated last year
- [IEEE PCS 2022 best paper finalist] "FloLPIPS: A Bespoke Video Quality Metric for Frame Interpolation", Duolikun Danier, Fan Zhang, David … ☆22 · Mar 9, 2024 · Updated 2 years ago
- Diffusion Model for Voice Conversion ☆17 · Oct 11, 2022 · Updated 3 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 · Aug 18, 2023 · Updated 2 years ago
- Multilingual-Speech-Synthesis-Voice-Conversion Using Bark + RVC ☆14 · Apr 19, 2025 · Updated 10 months ago
- Images to inference with no labeling (use foundation models to train supervised models). ☆2,637 · May 14, 2025 · Updated 9 months ago
- A pipeline focused on the in-painting of text in images. For example, the removal of subtitles in a screenshot of a movie. ☆16 · Jun 30, 2022 · Updated 3 years ago
- Open Server is an OpenAI API Compatible Server for generating text, images, embeddings, and storing them in vector databases. It also inc… ☆17 · Dec 8, 2023 · Updated 2 years ago
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆17 · Sep 21, 2023 · Updated 2 years ago