ngxson / smolvlm-realtime-webcam
Real-time webcam demo with SmolVLM and llama.cpp server
☆4,594 Updated 3 months ago
Alternatives and similar repositories for smolvlm-realtime-webcam
Users who are interested in smolvlm-realtime-webcam are comparing it to the libraries listed below.
- This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025 ☆5,717 Updated 3 months ago
- OmniGen2: Exploration to Advanced Multimodal Generation. ☆3,771 Updated last month
- A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms ☆2,083 Updated this week
- Frontier Open-Source Text-to-Speech ☆7,080 Updated this week
- Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your … ☆4,164 Updated this week
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,259 Updated 4 months ago
- 🎨 Turn your roughest sketches into stunning 3D worlds by vibe drawing ☆1,934 Updated last month
- mcp-use is the easiest way to interact with mcp servers with custom agents ☆7,082 Updated last week
- Real-time & local speech-to-text, translation, and speaker diarization. With server & web UI. ☆3,264 Updated this week
- A mini, open-weights, version of our Proxy assistant. ☆955 Updated 6 months ago
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆2,616 Updated last week
- Fully local web research and report writing assistant ☆8,064 Updated 3 weeks ago
- SoTA open-source TTS ☆11,210 Updated last month
- Lightweight coding agent that runs in your terminal ☆1,950 Updated 3 months ago
- Agent S: an open agentic framework that uses computers like a human ☆6,190 Updated last week
- RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO and designed for fine-tuning. ☆2,872 Updated last week
- Towards Human-Sounding Speech ☆5,458 Updated 3 months ago
- A Python package that makes it easy for developers to create AI apps powered by various AI providers. ☆1,646 Updated 4 months ago
- The python library for real-time communication ☆4,249 Updated last week
- 100+ Fine-tuning LLM Notebooks on Google Colab, Kaggle, and more. ☆3,523 Updated this week
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,305 Updated last week
- Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI O… ☆8,879 Updated this week
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,255 Updated 4 months ago
- A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive vi… ☆14,065 Updated this week
- ☆13,032 Updated last week
- ☆3,525 Updated 4 months ago
- The AI Browser Automation Framework ☆16,707 Updated this week
- Multilingual Document Layout Parsing in a Single Vision-Language Model ☆3,914 Updated last week
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆18,154 Updated last month
- ☆2,699 Updated 4 months ago