tapanBabbar9 / computer-vision
Experiments with CV
☆29Updated 4 months ago
Alternatives and similar repositories for computer-vision:
Users who are interested in computer-vision are comparing it to the repositories listed below.
- ☆21Updated 5 months ago
- Instantly convert ideas into app code with AI! This React app uses the Gemini API to generate and preview code from Markdown, making prot…☆12Updated last month
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT)☆15Updated 3 months ago
- WhisperAnywhere: Effortless speech-to-text everywhere on your Mac. Use a hotkey to dictate in any app, powered by Whisper AI and Groq API…☆29Updated 7 months ago
- MCP Server implementation for Claude☆24Updated 4 months ago
- A utility for generating conversational podcasts with AI text-to-speech, inspired by Google's NotebookLM.☆19Updated 7 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆23Updated 7 months ago
- ☆29Updated last year
- This repository will guide you to create your Images via Stable Diffusion using a Smart Virtual Assistant like Google Assistant using Ope…☆35Updated 2 years ago
- Deepseek R1 Agent powered by LMStudio and Smolagents☆30Updated 3 months ago
- An auto coder that automatically fixes errors and improves the code from a simple user prompt☆38Updated 4 months ago
- 🧠 Mem4AI: An LLM-friendly memory management library.☆24Updated 5 months ago
- On-device LLM Inference using Mediapipe LLM Inference API.☆21Updated last year
- A no-string API framework for deploying schema-based reasoning into third-party apps☆19Updated this week
- ☆13Updated 5 months ago
- ☆16Updated 5 months ago
- Generate video stories with AI ✨☆32Updated 7 months ago
- ☆18Updated last year
- ☆20Updated last year
- Your Python AI Coder!☆33Updated last week
- This little utility library allows you to ask the most common question when working with video content - does the video contain something…☆58Updated 9 months ago
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode…☆11Updated 8 months ago
- Groq-Whisper Fast Transcription App built using Groq API and Streamlit.☆24Updated 7 months ago
- Speak (speech-to-text) to LLMs (Ollama) in any language - Streamlit app☆43Updated last year
- Use this code to access pipeline to Gemini from inside notebookLM☆25Updated last year
- Multimodal LLM Application with PyMuPDF4LLM☆36Updated 6 months ago
- Streamlit application that helps users analyze RFPs using the latest Gemini 2.0 Flash Experimental LLM.☆13Updated 4 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o☆53Updated 6 months ago
- ☆16Updated last year
- LLM Siri with OpenAI, Perplexity, Ollama, Llama2, Mistral, Mixtral & Langchain☆60Updated last year