ngxson / smolvlm-realtime-webcam

Real-time webcam demo with SmolVLM and llama.cpp server

☆3,969

Alternatives and similar repositories for smolvlm-realtime-webcam

Users who are interested in smolvlm-realtime-webcam are comparing it to the repositories listed below.

apple / ml-fastvlm
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
☆4,233 Updated last month
roboflow / trackers
A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms
☆1,774 Updated this week
ymichael / open-codex
Lightweight coding agent that runs in your terminal
☆1,869 Updated last month
transformerlab / transformerlab-app
Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer…
☆3,454 Updated this week
nickscamara / open-deep-research
An open source deep research clone. AI agent that reasons over large amounts of web data extracted with Firecrawl
☆5,759 Updated last month
morphik-org / morphik-core
Open source multi-modal RAG for building AI apps over private knowledge.
☆2,696 Updated this week
Blaizzy / mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec…
☆2,420 Updated 2 weeks ago
PySpur-Dev / pyspur
A visual playground for agentic workflows: Iterate over your agents 10x faster
☆5,176 Updated last month
browser-use / workflow-use
⚙️ Create and run workflows (RPA 2.0)
☆3,362 Updated last week
unslothai / notebooks
Fine-tune LLMs for free with 100+ Notebooks on Google Colab, Kaggle, and more.
☆2,394 Updated this week
NirDiamant / agents-towards-production
This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark …
☆5,611 Updated this week
Nutlope / llama-ocr
Document to Markdown OCR library with Llama 3.2 vision
☆2,345 Updated 5 months ago
kyutai-labs / hibiki
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f…
☆1,125 Updated 2 months ago
huggingface / transformers.js-examples
A collection of 🤗 Transformers.js demos and example applications
☆1,615 Updated 3 weeks ago
mcp-use / mcp-use
mcp-use is the easiest way to interact with mcp servers with custom agents
☆3,920 Updated this week
sentient-agi / OpenDeepSearch
☆3,410 Updated 2 months ago
google-gemini / live-api-web-console
A React-based starter app for using the Live API over WebSockets with Gemini
☆2,212 Updated last month
reflex-dev / reflex-llm-examples
☆823 Updated last month
tjmlabs / ColiVara
Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st…
☆1,138 Updated last month
idosal / git-mcp
Put an end to code hallucinations! GitMCP is a free, open-source, remote MCP server for any GitHub project
☆3,145 Updated last month
AK391 / ai-gradio
A Python package that makes it easy for developers to create AI apps powered by various AI providers.
☆1,621 Updated 2 months ago
simular-ai / Agent-S
Agent S: an open agentic framework that uses computers like a human
☆5,560 Updated last week
resemble-ai / chatterbox
SoTA open-source TTS
☆8,784 Updated 2 weeks ago
rowboatlabs / rowboat
AI-powered multi-agent builder
☆3,246 Updated this week
roboflow / maestro
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
☆2,578 Updated this week
Olow304 / memvid
Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
☆7,978 Updated 3 weeks ago
guy-hartstein / company-research-agent
An agentic company research tool powered by LangGraph and Tavily that conducts deep diligence on companies using a multi-agent framework.…
☆1,239 Updated last week
bytedance / Dolphin
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
☆3,254 Updated this week
gradio-app / fastrtc
The Python library for real-time communication
☆4,044 Updated 2 weeks ago
NVIDIA-AI-Blueprints / pdf-to-podcast
Transform PDFs into AI podcasts for engaging on-the-go audio content.
☆671 Updated 3 weeks ago