Shohruh72 / HRNet-Landmarks
☆19 Updated 4 months ago
Alternatives and similar repositories for HRNet-Landmarks
Users who are interested in HRNet-Landmarks are comparing it to the libraries listed below.
- Securade.ai HUB - A generative AI based edge platform for computer vision that connects to existing CCTV cameras and makes them smart. ☆229 Updated 3 months ago
- Setting up VS Code to work with PyTorch in C/C++ with CUDA support ☆25 Updated 8 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 Updated 9 months ago
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆124 Updated 4 months ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆84 Updated last year
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆90 Updated last week
- ☆102 Updated last year
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆162 Updated 2 months ago
- A tool for converting computer vision label formats. ☆73 Updated 2 weeks ago
- 6D Rotation Representation for Unconstrained Head Pose Estimation ☆15 Updated 2 months ago
- MBASE, an LLM SDK in C++ ☆55 Updated 3 months ago
- World's Smallest Vision-Language Model ☆29 Updated last year
- ☆44 Updated last month
- SmolVLM Action Recognition ☆20 Updated 5 months ago
- OmniFusion, a multimodal model to communicate using text and images ☆233 Updated last year
- Hands-On Learning in Computer Vision ☆26 Updated this week
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆193 Updated 2 weeks ago
- ☆101 Updated 7 months ago
- 2D Positional Embeddings for Webpage Structural Understanding 🦙👀 ☆94 Updated last year
- Top ML papers of the week. ☆41 Updated this week
- Self-host LLMs with vLLM and BentoML ☆151 Updated last week
- ☆54 Updated last week
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 Updated 11 months ago
- Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023 ☆46 Updated last year
- An AI Vision Language Model System for extracting structured knowledge graph information (JSON) from images of process diagrams ☆31 Updated 6 months ago
- Computer Vision projects ☆22 Updated this week
- This is the repo for the paper "PANGEA: A FULLY OPEN MULTILINGUAL MULTIMODAL LLM FOR 39 LANGUAGES" ☆113 Updated 3 months ago
- ☆86 Updated last year
- Chat with Phi 3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which includ… ☆34 Updated 9 months ago
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆271 Updated 3 months ago