BTifmmp / paper-piano
Paper Piano uses Python and OpenCV to detect key presses on a hand-drawn piano, translating them into digital notes and sound.
☆39 · Updated 8 months ago
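For context on what "detect key presses on a hand-drawn piano" can mean in practice, below is a minimal illustrative sketch, not the repository's actual code: it assumes hard-coded key regions, a fixed webcam view, and a baseline frame of the empty keyboard, and flags a key as pressed when its region changes enough from that baseline. All names here (`KEY_REGIONS`, `pressed_keys`, the note list) are hypothetical.

```python
import cv2
import numpy as np

# Hypothetical hand-tuned key regions (x, y, w, h) for an 8-key paper keyboard,
# assuming a roughly 640x480 camera view; the actual project likely locates the
# drawn keys automatically instead of hard-coding them.
KEY_REGIONS = [(50 + i * 60, 300, 55, 150) for i in range(8)]
NOTE_NAMES = ["C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5"]

def pressed_keys(frame, baseline, threshold=40, min_changed=0.15):
    """Return indices of keys whose region differs enough from the baseline
    (empty-keyboard) frame, treating that change as a fingertip on the key."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, baseline)
    hits = []
    for i, (x, y, w, h) in enumerate(KEY_REGIONS):
        roi = diff[y:y + h, x:x + w]
        changed_fraction = np.count_nonzero(roi > threshold) / roi.size
        if changed_fraction > min_changed:
            hits.append(i)
    return hits

cap = cv2.VideoCapture(0)
ok, first = cap.read()
if not ok:
    raise RuntimeError("could not read from webcam")
baseline = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)  # reference: keyboard with no hands

while True:
    ok, frame = cap.read()
    if not ok:
        break
    for i in pressed_keys(frame, baseline):
        print("note:", NOTE_NAMES[i])  # a real app would trigger audio here
    for (x, y, w, h) in KEY_REGIONS:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 1)
    cv2.imshow("paper piano sketch", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

A real implementation would also need debouncing (so a held finger does not retrigger the note every frame) and actual audio output, which this sketch deliberately omits.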
Alternatives and similar repositories for paper-piano:
Users interested in paper-piano are comparing it to the libraries listed below
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆76 · Updated last week
- Eye exploration ☆26 · Updated 2 months ago
- ☆39 · Updated 4 months ago
- Using the moondream VLM with optical flow for promptable object tracking ☆53 · Updated 2 months ago
- An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language. ☆24 · Updated last month
- Computer Vision and Machine Learning Jupyter Notebooks for Educational Purposes ☆77 · Updated 4 months ago
- ☆111 · Updated 5 months ago
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆69 · Updated this week
- A comprehensive guide to getting started with OpenCV-Python, including scripts and detailed documentation for each topic. ☆132 · Updated last month
- VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆107 · Updated 7 months ago
- Mapping ping with a simple script and Ordinary Kriging to interpolate sparse measurements into a nice visualization! ☆79 · Updated 5 months ago
- ☆34 · Updated 5 months ago
- Each week I create sketches covering key Computer Vision concepts. If you want to learn more about CV, stick around! ☆148 · Updated 2 years ago
- EyePy – webcam-based eye tracking made simple ☆105 · Updated last month
- This repo has the code of the 3 demos I presented at Google Gemma2 DevDay Tokyo, using Gemma2 on a Jetson Orin Nano device. ☆41 · Updated 2 weeks ago
- Machine learning library, Distributed training, Deep learning, Reinforcement learning, Models, TensorFlow, PyTorch ☆60 · Updated this week
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detection… ☆80 · Updated 10 months ago
- ☆204 · Updated 11 months ago
- Multimodal AI agent with Llama 3.2: A Streamlit app that processes text, images, PDFs, and PPTs, integrating NIM microservices, Milvus, a… ☆110 · Updated 7 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆53 · Updated 6 months ago
- Ultralytics Notebooks ☆72 · Updated last week
- Some helpers and examples for creating an LLM fine-tuning dataset ☆70 · Updated last year
- A collection of sophisticated computer vision and machine learning problems for graduate-level researchers and practitioners ☆32 · Updated this week
- From scratch implementation of a vision language model in pure PyTorch ☆213 · Updated 11 months ago
- A new benchmark for measuring LLMs' capability to detect bugs in large codebases. ☆30 · Updated 10 months ago
- ☆27 · Updated last year
- Tiny client for LLMs with vision and tool calling. As simple as it gets. ☆84 · Updated 3 months ago
- An automated tool for discovering insights from research paper corpora ☆138 · Updated 10 months ago
- This repo contains the four stages of app development related to the `How's My Eating?` app ☆17 · Updated 2 months ago
- Blueprint for ingesting massive volumes of live or archived videos and extracting insights for summarization and interactive Q&A ☆32 · Updated this week