rorro6787 / img-desc-visually-impaired
Image description System for Impaired people
☆15 Updated 11 months ago
Alternatives and similar repositories for img-desc-visually-impaired
Users that are interested in img-desc-visually-impaired are comparing it to the libraries listed below
- This repo gives a start for the docker. ☆35 Updated last year
- ☆26 Updated last year
- Simple CogVLM client script ☆14 Updated 2 years ago
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆68 Updated last year
- Eye exploration ☆31 Updated last month
- ☆47 Updated last year
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆69 Updated last year
- EdgeSAM model for use with Autodistill. ☆29 Updated last year
- ☆29 Updated 2 years ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆85 Updated last year
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆12 Updated last year
- ☆62 Updated 2 years ago
- This project breathes life into video characters by using AI to describe their personality and then chat with you as them. ☆49 Updated last year
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆125 Updated 7 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆88 Updated 2 years ago
- The Facial Landmark Detection ☆15 Updated 5 months ago
- ☆22 Updated last year
- ☆38 Updated last year
- ☆25 Updated 2 years ago
- 🎮 Manipulates mobile phones just like how you would. Official code for "MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficien… ☆26 Updated 3 months ago
- Real-time object detection using Florence-2 with a user-friendly GUI. ☆30 Updated 5 months ago
- Run Vision LLMs, TTS and STT APIs. Website and API for https://text-generator.io ☆39 Updated this week
- Automatic Thief Detection via CCTV with Alarm System and Perpetrator Image Capture using YOLOv5 + ROI. This project utilizes computer vis… ☆14 Updated last year
- 6D Rotation Representation for Unconstrained Head Pose Estimation ☆17 Updated 5 months ago
- Supporting code for: Video Enriched Retrieval Augmented Generation Using Aligned Video Captions ☆32 Updated last year
- Vehicle speed estimation using YOLOv8 ☆32 Updated last year
- ☆16 Updated last year
- Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP ☆64 Updated 8 months ago
- ☆11 Updated last year
- 2nd place solution for the Generative Interior Design 2024 competition ☆126 Updated last year