A from-scratch (zero-to-one) VLM fine-tuning implementation, covering both pre-training and SFT, built without relying on any framework
☆40Aug 22, 2025Updated 9 months ago
Alternatives and similar repositories for VLM-Finetuning
Users who are interested in VLM-Finetuning are comparing it to the libraries listed below.
- A simple image classification project, covering training and prediction in PyTorch and inference with the PyTorch C++ API and TensorRT☆18Jun 15, 2020Updated 6 years ago
- ☆16Mar 24, 2025Updated last year
- Classify Traffic Signs.☆10Jan 31, 2017Updated 9 years ago
- official code for Dynamic Smooth Label Assignment☆12Oct 5, 2022Updated 3 years ago
- [ICML 2022 Spotlight] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks☆11May 21, 2023Updated 3 years ago
- [CVPR 2025] LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs☆14Jun 20, 2025Updated 11 months ago
- A collection of MoE (Mixture of Experts) papers, code, tools, etc.☆12Mar 15, 2024Updated 2 years ago
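- Built on index-tts-vllm; implements and provides an API service for simulated streaming audio synthesis, along with a client test script☆26Sep 2, 2025Updated 9 months ago
- Face++ is an innovative AI face-reading (physiognomy) analysis app developed for the Android platform. It combines traditional Chinese physiognomy theory (such as the "three courts and five eyes" and the "twelve palaces") with modern artificial intelligence to give users a professional, detailed, and insightful face-reading report☆22Jul 14, 2025Updated 11 months ago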
- Loop Closure Detector☆13Feb 2, 2018Updated 8 years ago
- ☆34Apr 23, 2026Updated last month
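- One-click deployment of a self-hosted WeChat mini program and web operations admin system for queued automatic digital-human training and queued short-video generation. Uses audio-driven lip sync trained on a single person, described as outperforming lip-sync approaches such as wav2lip, deepfacelab, liveportrait, and musetalk; ready for direct commercial use, with voice cloning supported in Chinese, Japanese, English, and Korean☆62May 10, 2026Updated last month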
- The Official PyTorch implementation of "Part Aware Contrastive Learning for Self-Supervised Action Recognition" in IJCAI 2023☆13Nov 9, 2023Updated 2 years ago
- ☆18Jun 14, 2025Updated last year
- Just a simple Android app that uses Rokid's CXR-M SDK to upload/sideload an APK onto your Rokid glasses over Wi-Fi. It might be hard to g…☆54Apr 9, 2026Updated 2 months ago
- Happy Hacking With Claude!!!☆25Oct 27, 2025Updated 7 months ago
- AI-Powered Research Desktop App☆64Apr 2, 2026Updated 2 months ago
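- An AI-driven virtual digital-human livestreaming system supporting modular development of 2D/3D digital humans, TTS, ASR, lip sync, stream pushing, interaction, and more.☆25May 13, 2025Updated last year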
- 3D_lut generation for surround view☆13Jul 31, 2019Updated 6 years ago
- [TOG 2024] BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation☆16Jun 14, 2024Updated 2 years ago
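- Stitches the SmolVLM2 vision head onto the Qwen3-0.6B model and fine-tunes the combined model☆590Sep 8, 2025Updated 9 months ago
- An open-source multimodal AI search project that combines large language models (LLMs), multi-source search engines, and a multi-agent architecture to deliver a next-generation, question-answering style search experience☆17Mar 26, 2025Updated last year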
- [ACM MM 2025] LIDAR: Lightweight Adaptive Cue-Aware Fusion Vision Mamba for Multimodal Segmentation of Structural Cracks☆23Nov 18, 2025Updated 6 months ago
- ☆44May 25, 2026Updated 3 weeks ago
- ☆12Sep 23, 2022Updated 3 years ago
- VesNet-RL: Simulation-based Reinforcement Learning for Real-World US Probe Navigation☆14Sep 27, 2023Updated 2 years ago
- HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model☆95Jul 17, 2025Updated 10 months ago
- [ACL2026 Findings] "Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models"☆20Mar 25, 2025Updated last year
- OCR with YOLOv3 as the feature extractor, implemented in Keras and accelerated with TensorRT☆34Aug 7, 2020Updated 5 years ago
- This sample shows how to deploy an industrial computer vision model to detect real world analog pointer meters and extract corresponding …☆12Sep 23, 2022Updated 3 years ago
- Parallelizes a serial implementation of 3D scene reconstruction from Kinect sensor input and runs it on an NVIDIA GPU using CUDA.☆12Nov 2, 2016Updated 9 years ago
- Code for ISBI 2024 paper "Fully Differentiable Correlation-driven 2D/3D Registration for X-Ray to CT Image Fusion"☆10Aug 26, 2024Updated last year
- Pyramid Attention Network for Medical Image Registration (ISBI 2024)☆16Feb 6, 2025Updated last year
- An improved model for multi-object detection and lane line segmentation, based on the YOLOP model.☆15Nov 5, 2022Updated 3 years ago
- Region growing for automatic spine segmentation☆11Apr 1, 2020Updated 6 years ago
- Official codebase for FACMIC: Federated Adaptative CLIP Model for Medical Image Classification (Accepted at MICCAI 2024)☆14Jun 21, 2024Updated last year
- Code for the BraTS 2023 challenge.☆17May 5, 2025Updated last year
- A template for Tensorflow 2.0 + Keras projects☆12Mar 25, 2023Updated 3 years ago
- A modified version of Andrej Karpathy's build-nanogpt☆36Oct 26, 2025Updated 7 months ago