huggingface / segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
☆71 Updated 7 months ago
Alternatives and similar repositories for segment-anything-2
Users who are interested in segment-anything-2 are comparing it to the libraries listed below.
- ☆345 Updated 8 months ago
- mlx image models for Apple Silicon machines ☆80 Updated last month
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆80 Updated last year
- SmolVLM2 Demo ☆150 Updated 2 months ago
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat… ☆64 Updated 8 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆165 Updated last week
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆21 Updated 10 months ago
- Notebooks to demonstrate TimmWrapper ☆16 Updated 4 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts. ☆52 Updated 6 months ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 Updated 8 months ago
- A Gradio web UI for Depth-Pro, Sharp Monocular Metric Depth Estimation ☆49 Updated 7 months ago
- Real-time pose estimation pipeline with 🤗 Transformers ☆59 Updated 3 months ago
- Open-source and reproducible benchmarks for Speaker Diarization ☆25 Updated last month
- ☆58 Updated last year
- ☆47 Updated last year
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets. ☆69 Updated 8 months ago
- MLX Swift implementation of Andrej Karpathy's Let's build GPT video ☆57 Updated last year
- ☆30 Updated last month
- A collection of optimizers for MLX ☆36 Updated this week
- A list of language models with permissive licenses such as MIT or Apache 2.0 ☆24 Updated 3 months ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model… ☆36 Updated last year
- ☆29 Updated last month
- run embeddings in MLX ☆89 Updated 8 months ago
- Parameter-efficient finetuning script for Phi-3-vision, the strong multimodal language model by Microsoft. ☆58 Updated 11 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆95 Updated 5 months ago
- MLX Transformers is a library that provides model implementation in MLX. It uses a similar model interface as HuggingFace Transformers an… ☆65 Updated 6 months ago
- Tools for merging pretrained large language models. ☆19 Updated 11 months ago
- Chat with Phi 3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which includ… ☆33 Updated 5 months ago
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆69 Updated last year
- ☆92 Updated 2 months ago