BTifmmp / paper-piano
Paper Piano uses Python and OpenCV to detect key presses on a hand-drawn piano, translating them into digital notes and sound.
★38 Updated 7 months ago
Alternatives and similar repositories for paper-piano:
Users who are interested in paper-piano are comparing it to the repositories listed below:
- ★107 Updated 4 months ago
- Real-time pose estimation pipeline with 🤗 Transformers ★57 Updated last month
- GPU Kernels ★157 Updated this week
- A collection of sophisticated computer vision and machine learning problems for graduate-level researchers and practitioners ★31 Updated 3 weeks ago
- Mapping ping with a simple script and Ordinary Kriging to interpolate sparse measurements into a nice visualization! ★80 Updated 5 months ago
- Computer Vision and Machine Learning Jupyter Notebooks for Educational Purposes ★76 Updated 3 months ago
- Eye exploration ★25 Updated last month
- Inference and fine-tuning examples for vision models from 🤗 Transformers ★73 Updated this week
- An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language. ★24 Updated last month
- EyePy - webcam-based eye tracking made simple ★102 Updated last month
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ★27 Updated this week
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ★106 Updated 6 months ago
- ★17 Updated last week
- Each week I create sketches covering key Computer Vision concepts. If you want to learn more about CV stick around! ★147 Updated 2 years ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ★80 Updated 10 months ago
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ★97 Updated last week
- Implementation of the snake game based on a diffusion model ★88 Updated 2 months ago
- This repo has all the basic things you'll need in order to understand complete vision transformer architecture and its various implementa… ★213 Updated 3 months ago
- Multimodal AI agent with Llama 3.2: A Streamlit app that processes text, images, PDFs, and PPTs, integrating NIM microservices, Milvus, a…