Labellerr / Hands-On-Learning-in-Computer-Vision
Hands-On Learning in Computer Vision
☆26 · Updated this week
Alternatives and similar repositories for Hands-On-Learning-in-Computer-Vision
Users interested in Hands-On-Learning-in-Computer-Vision are comparing it to the libraries listed below.
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆12 · Updated last year
- Eye exploration ☆29 · Updated 8 months ago
- Daily.co + Pipecat + Tavus AI Avatar Agent ☆14 · Updated 6 months ago
- 6D Rotation Representation for Unconstrained Head Pose Estimation ☆15 · Updated 2 months ago
- This repository demonstrates various examples using YOLO ☆13 · Updated last year
- Real-Time Open-Vocabulary Object Detection ☆12 · Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆67 · Updated last year
- ☆21 · Updated 11 months ago
- This repository contains a Multimodal Retrieval-Augmented Generation (RAG) pipeline that integrates images, audio, and text for advanced … ☆24 · Updated 9 months ago
- ☆16 · Updated last year
- ☆17 · Updated last year
- ☆21 · Updated 8 months ago
- Flask-based web application designed to compare text and image embeddings using the CLIP model. ☆22 · Updated last year
- EdgeSAM model for use with Autodistill. ☆29 · Updated last year
- ☆47 · Updated last year
- Retrieval-augmented generation (RAG) for remote & local LLM use ☆45 · Updated 4 months ago
- An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language. ☆29 · Updated 7 months ago
- ☆16 · Updated 4 months ago
- ☆21 · Updated last year
- AI Search engine ☆12 · Updated 3 weeks ago
- ☆54 · Updated last week
- ☆29 · Updated last year
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod… ☆14 · Updated last week
- Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and Siglip. ☆17 · Updated last year
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model… ☆37 · Updated 2 years ago
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆162 · Updated 2 months ago
- VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆124 · Updated 4 months ago
- 100 Days of GPU Challenge ☆23 · Updated last month
- ☆17 · Updated last year
- AnyModal is a flexible multimodal language model framework for PyTorch ☆103 · Updated 9 months ago
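Several entries above (the Flask CLIP comparison app, the zero-shot evaluation toolkit) revolve around the same core operation: measuring cosine similarity between text and image embeddings. A minimal sketch of that step, using NumPy only and random placeholder vectors in place of real model outputs (an actual pipeline would produce the embeddings with a CLIP model; the vector dimension of 512 here is just an illustrative choice):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors (range [-1, 1])."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Placeholder embeddings standing in for CLIP outputs.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
text_emb_match = image_emb + rng.normal(scale=0.1, size=512)  # nearly identical
text_emb_other = rng.normal(size=512)                          # unrelated

print(cosine_similarity(image_emb, text_emb_match))  # close to 1.0
print(cosine_similarity(image_emb, text_emb_other))  # near 0.0
```

In a real comparison app the same function would rank candidate texts (or images) against a query embedding; CLIP's training objective makes high cosine similarity correspond to semantic match between the modalities.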