Automated image & video captioning using Qwen-VL, Gemma4 and SAM3.
☆62Apr 14, 2026Updated this week
Alternatives and similar repositories for VisionCaptioner
Users who are interested in VisionCaptioner are comparing it to the libraries listed below.
- A sample Node.js application to demonstrate Intuit Payment APIs (supports both OAuth 1.0 and OAuth 2.0)☆10Jul 30, 2021Updated 4 years ago
- Fast and flexible keyboard layout analyzer and optimizer☆18Dec 10, 2025Updated 4 months ago
- ☆132Updated this week
- An extension to use Kokoro TTS in text generation webui☆22May 5, 2025Updated 11 months ago
- A practical collection of plug-and-play starter blueprints for building autonomous AI apps with real APIs. Discover ready-to-build agent …☆83Apr 7, 2026Updated last week
- Standalone Anima Lora trainer with GUI☆55Apr 7, 2026Updated last week
- Uses SAM3 to transfer 2D semantics onto 3D point clouds, producing segmented training data and efficient scene graphs for indoor environm…☆78Jan 23, 2026Updated 2 months ago
- Software for automatically tagging images. Its primary use is for training Stable Diffusion checkpoints and LoRAs.☆24Dec 4, 2025Updated 4 months ago
- Advanced drum machine for ComfyUI featuring a 64-step sequencer, custom sample support, and retro hardware aesthetics.☆20Jan 19, 2026Updated 2 months ago
- Doom for Gear VR☆19Jun 4, 2019Updated 6 years ago
- The official implementation of “MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction”☆55Mar 20, 2026Updated 3 weeks ago
- android_device_moto_wingray☆11May 11, 2016Updated 9 years ago
- Open-source WhatsApp bot with AI capabilities - Logan by YUV.AI☆66Feb 13, 2026Updated 2 months ago
- Deploy your own private OpenAI-compatible LLM☆27Jun 5, 2025Updated 10 months ago
- ComfyUI nodes for Wan 2.2 SVI 2 Pro with Keyframe control via First/Last Frame and seamless video stitching.☆57Mar 31, 2026Updated 2 weeks ago
- A repo for the Pipecat + Gemini Workshop at the AI Engineer World's Fair☆37Jun 3, 2025Updated 10 months ago
- BOF for Havoc that copies locked Windows files (SAM, SYSTEM, NTDS.dit) via raw MFT parsing — no VSS, no Registry APIs, no PowerShell☆118Apr 6, 2026Updated last week
- A Windows executable to generate MilkVR ".mvrl" files for a collection of videos on your local PC, allowing easy access to those videos f…☆12Aug 24, 2017Updated 8 years ago
- An Open Source Machine Learning Framework for Everyone☆11Jun 2, 2019Updated 6 years ago
- OpenCap Monocular: 3D Human Kinematics and Dynamics From a Single Smartphone Video☆116Updated this week
- ☆14Jun 25, 2022Updated 3 years ago
- ComfyUI custom node for generating prompts from images. Supports Qwen2.5 and Qwen3 (Instruct/Thinking) models, as well as the OpenAI API.☆24Jan 10, 2026Updated 3 months ago
- A native application with agentic support (MCP) for ultra-fast AI image generation using a highly optimized Z-Image-Turbo model with SD…☆31Feb 3, 2026Updated 2 months ago
- AllJoyn Device Service Bridge for Philips Hue☆11Jan 20, 2016Updated 10 years ago
- Mocap Project☆20Dec 13, 2012Updated 13 years ago
- Small, regex-based URL parsing function for use in browser-based JS☆27Mar 5, 2018Updated 8 years ago
- Inference of MiniCPM-o 2.6 in plain C/C++☆34Oct 14, 2025Updated 6 months ago
- Control Virt-A-Mate (NSFW) commands with the keyboard or gamepads☆14Sep 21, 2020Updated 5 years ago
- Webui for editing/managing LoRA datasets☆29Mar 14, 2024Updated 2 years ago
- Oculus Rift support for Game Maker☆16Jun 16, 2018Updated 7 years ago
- ☆12Nov 24, 2018Updated 7 years ago
- Implementations of camera2 api☆18Sep 10, 2018Updated 7 years ago
- "Make-A-Video", new SOTA text to video by Meta-FAIR - Tensorflow☆14Oct 22, 2022Updated 3 years ago
- Tabletop manipulation calibration between the robot arm, the fixed cameras, and the in-hand camera.☆11Apr 12, 2024Updated 2 years ago
- An organizing tool for custom morphs in Virt-a-Mate☆10Dec 8, 2022Updated 3 years ago
- Pathtracing☆14Jan 7, 2025Updated last year
- BiRefNet background removal for ComfyUI☆15Dec 19, 2025Updated 3 months ago
- Fast Fourier Transform Frontend☆13Dec 11, 2013Updated 12 years ago
- Wiki☆11Mar 11, 2020Updated 6 years ago