hasibzunair / peekaboo2
Official code for PEEKABOO2: Adapting Peekaboo with Segment Anything Model for Unsupervised Object Localization in Images and Videos.
☆27 Updated this week
Alternatives and similar repositories for peekaboo2
Users who are interested in peekaboo2 are comparing it to the libraries listed below.
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆162 Updated 3 months ago
- Let's bake an image. ☆15 Updated 2 months ago
- Using the moondream VLM with optical flow for promptable object tracking ☆71 Updated 8 months ago
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆125 Updated 5 months ago
- EyeTrax - webcam-based eye tracking made simple ☆202 Updated last month
- ☆59 Updated 7 months ago
- Practical Python exercises on classical computer vision and clean engineering practices ☆22 Updated 6 months ago
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆90 Updated 2 weeks ago
- ☆34 Updated last year
- Lightweight, open-source, high-performance Yolo implementation ☆49 Updated 5 months ago
- Each week I create sketches covering key Computer Vision concepts. If you want to learn more about CV stick around! ☆149 Updated 2 years ago
- Mapping ping with a simple script and Ordinary Kriging to interpolate sparse measurements into a nice visualization! ☆79 Updated last year
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆84 Updated last year
- Running an LLM on the ESP32 ☆79 Updated last year
- Create topological graph for image segments. ☆22 Updated last year
- Paper Piano uses Python and OpenCV to detect key presses on a hand-drawn piano, translating them into digital notes and sound. ☆42 Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆67 Updated last year
- Solving Computer Vision with AI agents ☆33 Updated 4 months ago
- an optimized, production-ready implementation of active speaker detection ☆72 Updated last year
- ☆77 Updated last month
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆102 Updated 10 months ago
- Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models ☆129 Updated last month
- An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language. ☆29 Updated 8 months ago
- CV video player is a Python-based customizable video player that helps computer vision practitioners to develop, analyze, and debug their… ☆24 Updated 10 months ago
- ☆67 Updated last year
- ☆68 Updated 8 months ago
- [CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model" ☆616 Updated 6 months ago
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆274 Updated 3 months ago
- ☆105 Updated 4 months ago
- ☆95 Updated 10 months ago