saharmor / gemini-multimodal-playground
Build realtime voice and video agents with Google's new Gemini 2.0 (API is free for now)
☆319Updated 4 months ago
Alternatives and similar repositories for gemini-multimodal-playground
Users who are interested in gemini-multimodal-playground are comparing it to the libraries listed below.
- Assistant for voice-to-blog writing☆147Updated 11 months ago
- PostBot 3000 is an open-source project that shows how to build a powerful AI agent and stream responses and generate artifacts. This proj…☆290Updated last year
- Use OpenAI's realtime API for chatting with your documents☆330Updated last year
- Realtime API with Firecrawl Tool - Forked from the OpenAI Realtime Console☆161Updated last week
- An open-source implementation of NotebookLM using Deepseek-V3 and PlayHT TTS.☆294Updated last year
- SearchGPT / Perplexity Pages clone, but personalised for you.☆247Updated last year
- Voice-Enabled Math Tutor Powered by Groq that Calculates and Renders Live Problems and Instruction with LaTeX in Seconds!☆239Updated 2 weeks ago
- podcastfy.ai gradio demo app☆332Updated last year
- AI Meeting Minutes analysis App built with NextJS, Langflow, Groq, and OpenAI☆494Updated last year
- Gemini Multimodal Live + WebRTC in a single `app.ts`☆212Updated 3 months ago
- A NextJS/Langflow based app that takes a PDF and converts it into a podcast.☆226Updated last year
- Building Blocks for Multi-Modal Gradio Powered by Groq Apps☆115Updated last year
- ☆253Updated 11 months ago
- A very quick project that transforms research papers into engaging three-person discussions, offering an intuitive and thought-provoking …☆602Updated last year
- Example code for building LangChain agents without an OpenAI API key (Google Gemini). Completely free, unlimited, and open source; run it yours…☆315Updated last year
- ReActMCP is a reactive MCP client that empowers AI assistants to instantly respond with real-time, Markdown-formatted web search insights…☆141Updated 9 months ago
- A Chrome extension for asking questions over websites☆356Updated 11 months ago
- 🔥 Generate llms.txt and llms-full.txt files for any website!☆508Updated 7 months ago
- ☆263Updated last year
- A powerful Python tool that leverages Claude 3.5 Sonnet Vision API to detect and visualize objects in images. The script automatically dr…☆221Updated last year
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux☆309Updated 5 months ago
- Oliva Multi-Agent Assistant☆385Updated 9 months ago
- Multi-agent that helps you organize and write documents.☆348Updated last year
- YT Navigator: AI-powered YouTube content explorer that lets you search and chat with channel videos using AI agents. Extract insights fro…☆561Updated 9 months ago
- An implementation of a computer use agent (CUA) using LangGraph☆195Updated 9 months ago
- ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3☆500Updated 5 months ago
- Serverless Modal + FastAPI + React + ColPali + Qdrant + GPT4o Vision RAG (V-RAG) Demo☆405Updated 6 months ago
- 🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.☆643Updated last week
- Turn local files into a prompt for an LLM☆177Updated 11 months ago
- The Open Deep Research app – generate reports with OSS LLMs☆314Updated last month