ngxson / smolvlm-realtime-webcam
Real-time webcam demo with SmolVLM and llama.cpp server
☆4,834 · Updated 6 months ago
Alternatives and similar repositories for smolvlm-realtime-webcam
Users who are interested in smolvlm-realtime-webcam are comparing it to the libraries listed below.
- A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms ☆2,173 · Updated last week
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆2,946 · Updated last week
- This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025 ☆6,906 · Updated 6 months ago
- OCR model that handles complex tables, forms, handwriting with full layout. ☆2,945 · Updated last week
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,616 · Updated this week
- A Python package that makes it easy for developers to create AI apps powered by various AI providers. ☆1,650 · Updated 7 months ago
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆1,867 · Updated last week
- RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal… ☆4,477 · Updated last week
- On-device TTS model by Neuphonic ☆4,044 · Updated last week
- SoTA open-source TTS ☆14,677 · Updated 2 months ago
- Lightweight coding agent that runs in your terminal ☆2,153 · Updated 6 months ago
- Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed. ☆10,422 · Updated last month
- A collection of 🤗 Transformers.js demos and example applications ☆1,851 · Updated this week
- ☆2,065 · Updated 8 months ago
- WhatsApp MCP server ☆5,083 · Updated 4 months ago
- ⚙️ Create and run workflows (RPA 2.0) ☆3,793 · Updated last week
- An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Co… ☆5,655 · Updated 3 weeks ago
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,387 · Updated 6 months ago
- Real Time Speech Transcription with FastRTC ⚡️and Local Whisper 🤗 ☆692 · Updated 4 months ago
- "AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework" ☆7,838 · Updated last month
- 100+ Fine-tuning Tutorial Notebooks on Google Colab, Kaggle and more. ☆3,848 · Updated this week
- TTS with kokoro and onnx runtime ☆2,274 · Updated 5 months ago
- Multilingual Document Layout Parsing in a Single Vision-Language Model ☆5,722 · Updated 3 weeks ago
- ☆512 · Updated 6 months ago
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,329 · Updated 7 months ago
- ☆839 · Updated 6 months ago
- A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally. ☆14,395 · Updated this week
- Towards Human-Sounding Speech ☆5,756 · Updated 6 months ago
- Run LLMs with MLX ☆2,926 · Updated last week
- Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your … ☆4,571 · Updated this week