Doriandarko / Claude-Vision-Object-Detection
A powerful Python tool that leverages the Claude 3.5 Sonnet vision API to detect and visualize objects in images. The script automatically draws bounding boxes around detected objects, labels them, and displays confidence scores.
☆217Updated last year
Alternatives and similar repositories for Claude-Vision-Object-Detection
Users who are interested in Claude-Vision-Object-Detection are comparing it to the repositories listed below
- Realtime API with Firecrawl Tool - Forked from the OpenAI Realtime Console☆159Updated last year
- ☆249Updated 9 months ago
- mind map generator☆72Updated 10 months ago
- podcastfy.ai gradio demo app☆334Updated 11 months ago
- Gemini Multimodal Live + WebRTC in a single `app.ts`☆210Updated 3 weeks ago
- 🔥 Generate llms.txt and llms-full.txt files for any website!☆479Updated 4 months ago
- Use OpenAI's realtime API for chatting with your documents☆330Updated last year
- A NextJS/Langflow based app that takes a PDF and converts it into a podcast.☆225Updated last year
- napkins.dev – from screenshot to app☆86Updated last year
- openperplex is an open-source AI search engine☆171Updated last year
- SearchGPT / Perplexity Pages clone, but personalised for you.☆245Updated last year
- Voice-Enabled Math Tutor Powered by Groq that Calculates and Renders Live Problems and Instruction with LaTeX in Seconds!☆236Updated 10 months ago
- ☆154Updated 3 weeks ago
- Turn local files into a prompt for an LLM☆177Updated 9 months ago
- The AI assistant for computer control.☆320Updated last year
- ☆137Updated 9 months ago
- A powerful Python tool for performing technical searches using the Perplexity API, optimized for retrieving precise facts, code examples,…☆208Updated 9 months ago
- An Amazon Fresh MCP server☆63Updated 11 months ago
- A cool AI Diagram generator from a given topic, that streams the partial diagrams from the incomplete JSONs during generation. Built usin…☆214Updated last year
- PostBot 3000 is an open-source project that shows how to build a powerful AI agent and stream responses and generate artifacts. This proj…☆288Updated 11 months ago
- Convert PowerPoint files into semantically rich text using vision language models☆107Updated this week
- The Open Deep Research app – generate reports with OSS LLMs☆303Updated this week
- ☆193Updated 3 weeks ago
- DeepSeek & o1 auto coders which write Python code from a simple description, then iteratively improve it and fix errors☆95Updated 9 months ago
- Assistant for voice-to-blog writing☆145Updated 9 months ago
- Building Blocks for Multi-Modal Gradio Powered by Groq Apps☆114Updated last year
- Chat Application Starter Kit — Gemini Multimodal Live API + Pipecat☆221Updated 3 weeks ago
- ☆188Updated 11 months ago
- An experiment in meeting transcription and diarization with just an LLM. Maybe I went a little overboard though☆558Updated 3 weeks ago
- Chrome extension that interacts with content using Groq☆41Updated 10 months ago