tigrisdata-community / multi-modal-starter-kit
Multi-modal starter kit for AI video understanding and narration. Works with Ollama (LLaVA, BakLLaVA) and GPT-4V.
☆133 · Updated 9 months ago
Alternatives and similar repositories for multi-modal-starter-kit
Users who are interested in multi-modal-starter-kit are comparing it to the repositories listed below.
- AI agent to automatically check grammar and spelling on documentation files ☆88 · Updated 9 months ago
- Demo of AI chatbot that predicts user message to generate response quickly. ☆103 · Updated last year
- A feed of trending repos/models from GitHub, Replicate, HuggingFace, and Reddit. ☆132 · Updated 3 weeks ago
- The Moshi speech-to-speech model, deployed to Modal with a realtime CLI chat ☆57 · Updated 9 months ago
- List of awesome projects powered by fal.ai ☆78 · Updated 10 months ago
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew… ☆59 · Updated last year
- ActBot is a prototype for an injectable chatbot to give any website agentic capabilities ☆58 · Updated last year
- A Spotify playlist agent using CrewAI ☆81 · Updated last year
- A minimalistic template for dynamic self-building AI agents ☆97 · Updated 5 months ago
- Turn text from websites into spoken audio with edge-tts, F5, etc. and save as mp3 files ☆47 · Updated 2 weeks ago
- A couple of scripts to grab stats from email ☆43 · Updated 9 months ago
- Record voice notes & transcribe, summarize, and get tasks ☆43 · Updated last year
- Automatic fine-tuning of models with synthetic data ☆75 · Updated last year
- Open-source chat app that uses Exa's API for web search and OpenAI o3-mini ☆45 · Updated 2 weeks ago
- AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more ☆292 · Updated 11 months ago
- Safely deploy OpenAI's Realtime APIs in less than 5 minutes! ☆156 · Updated 8 months ago
- napkins.dev – from screenshot to app ☆86 · Updated 8 months ago
- Useful resources for LLM-based Diarization and Transcription. ☆55 · Updated 8 months ago
- 🌸 The open framework for question answering fine-tuning LLMs on private data ☆69 · Updated last year
- The next evolution of Agents ☆48 · Updated 3 weeks ago
- Replicate Flux LoRA image editor. ☆51 · Updated 9 months ago
- Create keyboard shortcuts for an LLM using OpenAI GPT, Ollama, HuggingFace with Automator on macOS. ☆151 · Updated last year
- Converts URL content into JSON with a simple prefix ☆69 · Updated last year
- A browser extension that demos Gemini Nano via window.ai and Cartesia TTS ⚡️ ☆39 · Updated 11 months ago
- Mobile web app for audio "push-to-talk" + TTS chat interface with OpenAI-like APIs ☆43 · Updated last year