Doriandarko / Claude-Vision-Object-Detection
A Python tool that uses the Claude 3.5 Sonnet vision API to detect and visualize objects in images. The script automatically draws bounding boxes around detected objects, labels them, and displays confidence scores.
☆220Updated last year
Alternatives and similar repositories for Claude-Vision-Object-Detection
Users who are interested in Claude-Vision-Object-Detection are comparing it to the repositories listed below
- Realtime API with Firecrawl Tool - Forked from the OpenAI Realtime Console☆160Updated this week
- ☆252Updated 11 months ago
- mind map generator☆72Updated last year
- Use OpenAI's realtime API for chatting with your documents☆331Updated last year
- Gemini Multimodal Live + WebRTC in a single `app.ts`☆212Updated 2 months ago
- podcastfy.ai gradio demo app☆334Updated last year
- 🔥 Generate llms.txt and llms-full.txt files for any website!☆504Updated 6 months ago
- SearchGPT / Perplexity Pages clone, but personalised for you.☆247Updated last year
- A NextJS/Langflow based app that takes a PDF and converts it into a podcast.☆227Updated last year
- Voice-Enabled Math Tutor Powered by Groq that Calculates and Renders Live Problems and Instruction with LaTeX in Seconds!☆237Updated last week
- openperplex is an opensource AI search engine☆171Updated last year
- napkins.dev – from screenshot to app☆86Updated last year
- A cool AI Diagram generator from a given topic, that streams the partial diagrams from the incomplete JSONs during generation. Built usin…☆217Updated last year
- ☆136Updated 11 months ago
- The AI assistant for computer control.☆329Updated last year
- A powerful Python tool for performing technical searches using the Perplexity API, optimized for retrieving precise facts, code examples,…☆212Updated 11 months ago
- Convert PowerPoint files into semantically rich text using vision language models☆110Updated last month
- Building Blocks for Multi-Modal Gradio Powered by Groq Apps☆115Updated last year
- An experiment in meeting transcription and diarization with just an LLM. Maybe I went a little overboard though☆566Updated last month
- Turn local files into a prompt for an LLM☆177Updated 11 months ago
- An amazon fresh mcp server☆62Updated last year
- ☆158Updated 3 weeks ago
- DeepSeek & o1 auto coders that write Python code from a simple description, then iteratively improve it and fix errors☆96Updated 11 months ago
- NotebookLlama powered by Groq - Create podcasts on any topic lightning fast☆78Updated last year
- PostBot 3000 is an open-source project that shows how to build a powerful AI agent and stream responses and generate artifacts. This proj…☆290Updated last year
- An opensource implementation of NotebookLM using Deepseek-V3 and PlayHT TTS.☆295Updated last year
- Build realtime voice and video agents with Google's new Gemini 2.0 (API is free for now)☆320Updated 3 months ago
- The Open Deep Research app – generate reports with OSS LLMs☆314Updated 3 weeks ago
- Youtube API Server used in https://git.new/scira☆344Updated 5 months ago
- Chat Application Starter Kit — Gemini Multimodal Live API + Pipecat☆224Updated 2 months ago