stephansturges / WALDO
Whereabouts Ascertainment for Low-lying Detectable Objects. The SOTA in FOSS AI for drones!
☆1,496 · Updated 2 months ago
Alternatives and similar repositories for WALDO:
Users who are interested in WALDO are comparing it to the libraries listed below.
- Turn any computer or edge device into a command center for your computer vision projects. ☆1,558 · Updated this week
- Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,436 · Updated this week
- Neural Autonomous Navigation Observer is a set of very small DNNs for drones to detect a few simple objects ☆167 · Updated last year
- OpenCV+YOLO+LLAVA powered video surveillance system ☆745 · Updated 2 weeks ago
- Images to inference with no labeling (use foundation models to train supervised models). ☆2,160 · Updated 3 months ago
- The official Roboflow Python package. Manage your datasets, models, and deployments. Roboflow has everything you need to build a computer… ☆360 · Updated this week
- Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models ☆111 · Updated last week
- Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders (CVPR 2025) ☆516 · Updated last week
- Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory" ☆6,599 · Updated 3 weeks ago
- This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models. ☆1,044 · Updated last month
- Local realtime voice AI ☆2,254 · Updated last week
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide… ☆1,242 · Updated last month
- A fast multimodal LLM for real-time voice ☆3,705 · Updated last month
- An open-source computer vision framework to build and deploy apps in minutes ☆743 · Updated 10 months ago
- Aura is like Siri, but in your browser. An AI voice assistant optimized for low latency responses. ☆1,203 · Updated 3 months ago
- This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo] ☆705 · Updated 8 months ago
- Create your custom OpenCV algorithms using a user-friendly node editor interface, inspired by Blender and Unreal Engine blueprints! Quic… ☆352 · Updated last week
- 4M: Massively Multimodal Masked Modeling ☆1,693 · Updated this week
- Build reliable customer facing agents with foundational LLMs using behavioral guidelines and runtime supervision ☆1,676 · Updated this week
- A Kubernetes deployable instance of GroundX for document parsing, storage, and search. ☆559 · Updated this week
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆844 · Updated last month
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆103 · Updated 5 months ago
- Document to Markdown OCR library with Llama 3.2 vision ☆2,207 · Updated last month
- Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer… ☆1,424 · Updated this week
- Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥 ☆1,678 · Updated last month
- CoTracker is a model for tracking any point (pixel) on a video. ☆4,167 · Updated last month
- Tracking Anything in High Quality ☆748 · Updated last year
- High-resolution models for human tasks. ☆4,878 · Updated 3 months ago