xISSAx / Alpha-Co-Vision
A real-time video-caption-to-conversation bot that captures frames, generates captions, and creates conversational responses using a large language model to produce interactive video descriptions.
☆119 · Updated last year
Related projects
Alternatives and complementary repositories for Alpha-Co-Vision
- The open-source autonomous agent LLM initiative ☆90 · Updated 8 months ago
- Extract information, summarize, ask questions, and search videos using OpenAI's Vision API ☆61 · Updated last year
- Maybe the new state of the art vision model? we'll see 🤷 ☆153 · Updated 10 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 · Updated 6 months ago
- Zero-Shot Autonomous Robots ☆98 · Updated 7 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) ☆76 · Updated last year
- Cerule - A Tiny Mighty Vision Model ☆67 · Updated 2 months ago
- Conduct consumer interviews with synthetic focus groups using LLMs and LangChain ☆43 · Updated last year
- Build a Streamlit Chatbot using Langchain, ColBERT, Ragatouille, and ChromaDB ☆116 · Updated 9 months ago
- This project breathes life into video characters by using AI to describe their personality and then chat with you as them. ☆45 · Updated 7 months ago
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast ☆137 · Updated 2 months ago
- ☆37 · Updated last year
- A voice-enabled chatbot application built using 🦜️🔗 LangChain, text-to-speech, and speech-to-text models from 🤗 Hugging Face, and … ☆188 · Updated 11 months ago
- Chat Bot with LLM and Fact Reference. RAG (Retrieval Augmented Generation) and LangChain backed ☆128 · Updated 6 months ago
- Transcribe and summarize videos using whisper and llms on apple mlx framework ☆70 · Updated 9 months ago
- Claude API Test Project ☆87 · Updated 6 months ago
- ☆103 · Updated 7 months ago
- ☆208 · Updated 10 months ago
- VideoDB Python SDK ☆60 · Updated this week
- Build your Swarm of Internet Agents using MultiOn ☆77 · Updated 10 months ago
- Fine Tuning Multimodal LLM "Idefics 9B" on Pokemon Go Dataset available on Hugging Face. ☆16 · Updated 9 months ago
- Pull high-quality, efficient embeddings for PubMed, arXiv and Wikipedia from Huggingface and use for local LLM inference/Retrieval Augmen… ☆37 · Updated 8 months ago
- The Next Generation Multi-Modality Superintelligence ☆70 · Updated 2 months ago
- Video+code lecture on building nanoGPT from scratch ☆64 · Updated 4 months ago
- Command-line script for inferencing from models such as MPT-7B-Chat ☆102 · Updated last year
- Solving data for LLMs - Create quality synthetic datasets! ☆136 · Updated 3 weeks ago
- Deploy your GGML models to HuggingFace Spaces with Docker and gradio ☆35 · Updated last year
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆126 · Updated this week
- Data Questionnaire Agent Chatbot ☆61 · Updated 2 weeks ago
- ☆188 · Updated 5 months ago