ymrohit / openscenesense-ollama
OpenSceneSense Ollama is a Python library that harnesses AI for advanced local video analysis, offering customizable frame and audio insights for dynamic applications in media, education, and content moderation.
☆24Updated 7 months ago
Alternatives and similar repositories for openscenesense-ollama
Users who are interested in openscenesense-ollama are comparing it to the libraries listed below.
- Cascading voice assistant combining real-time speech recognition, AI reasoning, and neural text-to-speech capabilities.☆98Updated last month
- Self-hosted AI medical scribe.☆44Updated last week
- Groq-Whisper Fast Transcription App built using Groq API and Streamlit.☆23Updated 9 months ago
- ☆49Updated this week
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and…☆99Updated 2 months ago
- Turn text from websites into spoken audio with edge-tts, F5, etc. and save as mp3 files☆47Updated this week
- Use smol agents to do research and then update CSV columns with its findings.☆41Updated 4 months ago
- A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.☆133Updated last year
- Python client SDK for Ultravox.☆14Updated 2 months ago
- Open source implementation for computer use, using light OCR models and LLMs. Get Android app in link below.☆26Updated 2 weeks ago
- ☆66Updated last year
- ☆61Updated 3 months ago
- Whisper STT + Orpheus TTS + Gemma 3 using LM Studio to create a virtual assistant.☆61Updated last month
- High level tool use for LLMs☆34Updated 10 months ago
- Faster Whisper with additional features☆44Updated 3 months ago
- Experience the power of AI with this free AI voice generator demo. Utilizing Deepgram and Groq, we transform text into voice seamlessly. …☆37Updated last year
- Developer tools to debug and build realtime voice agents. Supports multiple models.☆45Updated last month
- Screenshot LLM is a Python application that leverages the power of AI to analyze screenshots. Built with PyQt6 for a user-friendly interf…☆42Updated 7 months ago
- ☆29Updated 8 months ago
- Build Phone Calling Voice Agent fully powered by open source models.☆46Updated 2 months ago
- AI-powered tool for automatic podcast script and audio generation.☆72Updated 2 years ago
- Dia-JAX: A JAX port of Dia, the text-to-speech model for generating realistic dialogue from text with emotion and tone control.☆27Updated last month
- Jockey is a conversational video agent.☆81Updated last month
- ☆29Updated last year
- Generate full fledged PDF reports using LLMs like GPT, Claude, Llama☆54Updated last year
- Experimental Python SDK for OpenAI's Realtime API☆43Updated 4 months ago
- OLLama IMage CAtegorizer☆67Updated 5 months ago
- A lightweight recreation of OS1/Samantha from the movie Her, running locally in the browser☆101Updated 2 months ago
- Insanely Fast Transcription: A Python-based utility for rapid audio transcription from YouTube videos or local files. Leverages GPU accel…☆87Updated 11 months ago
- ☆91Updated 4 months ago