ngxson / smolvlm-realtime-webcam
Real-time webcam demo with SmolVLM and llama.cpp server
☆4,777 Updated 5 months ago
Alternatives and similar repositories for smolvlm-realtime-webcam
Users who are interested in smolvlm-realtime-webcam are comparing it to the repositories listed below.
- 100+ Fine-tuning LLM Notebooks on Google Colab, Kaggle, and more. ☆3,725 Updated this week
- Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your … ☆4,417 Updated this week
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,297 Updated 6 months ago
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,436 Updated 3 weeks ago
- ☆1,743 Updated last week
- The python library for real-time communication ☆4,335 Updated 3 weeks ago
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,328 Updated 5 months ago
- https://hf.co/hexgrad/Kokoro-82M ☆4,530 Updated 2 months ago
- SOTA search powered LLM ☆3,676 Updated 6 months ago
- ☆1,273 Updated this week
- This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025 ☆6,746 Updated 5 months ago
- Real Time Speech Transcription with FastRTC ⚡️ and Local Whisper 🤗 ☆684 Updated 3 months ago
- Lightweight coding agent that runs in your terminal ☆2,113 Updated 5 months ago
- An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Co… ☆5,496 Updated this week
- A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms ☆2,146 Updated last week
- Transform PDFs into AI podcasts for engaging on-the-go audio content. ☆749 Updated 4 months ago
- RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO and designed for fine-tun… ☆3,582 Updated last week
- computer vision and sports ☆4,656 Updated 2 months ago
- Towards Human-Sounding Speech ☆5,617 Updated 5 months ago
- On-device TTS model by Neuphonic ☆2,614 Updated this week
- State-of-the-art TTS model under 25MB 😻 ☆8,881 Updated last month
- Make text LLMs listen and speak ☆910 Updated this week
- Tool for generating high quality Synthetic datasets ☆1,255 Updated 2 weeks ago
- Have a natural, spoken conversation with AI! ☆3,245 Updated 3 months ago
- Paper2Agent is a multi-agent AI system that automatically transforms research papers into interactive AI agents with minimal human input. ☆1,349 Updated 3 weeks ago
- A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally. ☆14,133 Updated last week
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,903 Updated last month
- Multilingual Document Layout Parsing in a Single Vision-Language Model ☆4,963 Updated this week
- A Conversational Speech Generation Model ☆14,154 Updated 4 months ago
- The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025. ☆7,336 Updated 2 weeks ago