Doriandarko / Claude-Vision-Object-Detection
A Python tool that uses the Claude 3.5 Sonnet Vision API to detect and visualize objects in images. The script automatically draws bounding boxes around detected objects, labels them, and displays confidence scores.
☆215 Updated 11 months ago
Alternatives and similar repositories for Claude-Vision-Object-Detection
Users who are interested in Claude-Vision-Object-Detection are comparing it to the repositories listed below.
- ☆249 Updated 8 months ago
- Realtime API with Firecrawl Tool - Forked from the OpenAI Realtime Console ☆159 Updated last year
- podcastfy.ai gradio demo app ☆335 Updated 10 months ago
- Gemini Multimodal Live + WebRTC in a single `app.ts` ☆209 Updated last week
- A NextJS/Langflow based app that takes a PDF and converts it into a podcast. ☆225 Updated 11 months ago
- mind map generator ☆73 Updated 10 months ago
- 🔥 Generate llms.txt and llms-full.txt files for any website! ☆471 Updated 4 months ago
- An experiment in meeting transcription and diarization with just an LLM. Maybe I went a little overboard though ☆557 Updated 4 months ago
- Use OpenAI's realtime API for chatting with your documents ☆330 Updated last year
- Voice-Enabled Math Tutor Powered by Groq that Calculates and Renders Live Problems and Instruction with LaTeX in Seconds! ☆236 Updated 9 months ago
- napkins.dev – from screenshot to app ☆86 Updated last year
- The AI assistant for computer control. ☆319 Updated last year
- A powerful Python tool for performing technical searches using the Perplexity API, optimized for retrieving precise facts, code examples,… ☆210 Updated 9 months ago
- openperplex is an open-source AI search engine ☆171 Updated last year
- PostBot 3000 is an open-source project that shows how to build a powerful AI agent and stream responses and generate artifacts. This proj… ☆288 Updated 10 months ago
- SearchGPT / Perplexity Pages clone, but personalised for you. ☆245 Updated last year
- An Amazon Fresh MCP server ☆64 Updated 10 months ago
- ☆154 Updated this week
- The Open Deep Research app – generate reports with OSS LLMs ☆302 Updated 3 months ago
- Convert PowerPoint files into semantically rich text using vision language models ☆107 Updated 7 months ago
- DeepSeek & o1 auto coders which write Python code from a simple description, iteratively improve it, and fix errors ☆95 Updated 9 months ago
- Building Blocks for Multi-Modal Gradio Powered by Groq Apps ☆113 Updated 11 months ago
- A Chrome extension for asking questions over websites ☆349 Updated 8 months ago
- Realtime Voice and Vision with Brilliant Labs Frame and Gemini ☆66 Updated 5 months ago
- A cool AI Diagram generator from a given topic, that streams the partial diagrams from the incomplete JSONs during generation. Built usin… ☆214 Updated last year
- This repository contains the code for a virtual try-on application built using Flask, Twilio's WhatsApp API, and Gradio's virtual try-on … ☆348 Updated last year
- ☆137 Updated 8 months ago
- Hallucination Detector is a free and open-source tool that helps you verify the accuracy of your LLM generated content instantly. ☆289 Updated 4 months ago
- Chat Application Starter Kit — Gemini Multimodal Live API + Pipecat ☆221 Updated last week
- ☆188 Updated 10 months ago