A from-scratch (0 to 1) VLM fine-tuning implementation that does not rely on any framework (covers both pre-training and SFT)
☆37 Aug 22, 2025 Updated 7 months ago
Alternatives and similar repositories for VLM-Finetuning
Users that are interested in VLM-Finetuning are comparing it to the libraries listed below.
- ☆16 Mar 24, 2025 Updated 11 months ago
- Classify Traffic Signs. ☆10 Jan 31, 2017 Updated 9 years ago
- Deploying a stripped-down version of yolov5 on HiSilicon devices ☆13 Nov 22, 2021 Updated 4 years ago
- Official implementation of TransNetR: Transformer-based Residual Network for Polyp Segmentation with Multi-Center Out-of-Distribution Tes… ☆24 Feb 23, 2024 Updated 2 years ago
- official code for Dynamic Smooth Label Assignment ☆11 Oct 5, 2022 Updated 3 years ago
- Code for "RSF: Optimizing Rigid Scene Flow From 3D Point Clouds Without Labels" ☆10 Jan 17, 2023 Updated 3 years ago
- [ICML 2022 Spotlight] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks ☆11 May 21, 2023 Updated 2 years ago
- ☆46 Nov 12, 2025 Updated 4 months ago
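- One-click deployment of a self-hosted WeChat mini program and web operations admin backend for queued automatic digital-human training and queued short-video generation; uses audio-driven lip sync trained per individual speaker, claimed to outperform lip-sync solutions such as wav2lip, deepfacelab, liveportrait, and musetalk; ready for commercial use; supports voice cloning in Chinese, Japanese, English, and Korean ☆57 Apr 14, 2025 Updated 11 months ago
- Face++ is an innovative AI face-reading (physiognomy) app developed for the Android platform. It combines traditional Chinese physiognomy theory (such as the "Three Courts and Five Eyes" and the "Twelve Palaces") with modern artificial intelligence to give users a professional, detailed, and insightful face-reading report ☆22 Jul 14, 2025 Updated 8 months ago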
- ☆12 May 19, 2024 Updated last year
- ☆12 May 19, 2020 Updated 5 years ago
- ☆33 Dec 17, 2025 Updated 3 months ago
- ☆18 Jun 14, 2025 Updated 9 months ago
- The Official PyTorch implementation of "Part Aware Contrastive Learning for Self-Supervised Action Recognition" in IJCAI 2023 ☆13 Nov 9, 2023 Updated 2 years ago
- This sample shows how to use the oneAPI Video Processing Library (oneVPL) to perform a single and multi-source video decode and preproces… ☆15 Jun 15, 2023 Updated 2 years ago
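- Adds a simple UI and API calling interface for indextts2 ☆26 Oct 27, 2025 Updated 4 months ago
- Focal Loss for classification tasks, implemented in PyTorch ☆11 Jun 13, 2023 Updated 2 years ago
- Backend for converting PPT to digital humans ☆19 Apr 9, 2025 Updated 11 months ago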
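- ☆10 Nov 12, 2020 Updated 5 years ago
- Splices the vision head of SmolVLM2 onto the Qwen3-0.6B model and fine-tunes the combination ☆559 Sep 8, 2025 Updated 6 months ago
- Object-Region Video Transformers ☆24 Mar 24, 2022 Updated 3 years ago
- An open-source multimodal AI search project that combines large language models (LLMs), multi-source search engines, and a multi-agent architecture to deliver a next-generation intelligent question-answering search experience ☆14 Mar 26, 2025 Updated 11 months ago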
- Multi-modal 3D ultrasound and CT in image-guided spinal surgery: public database and new registration algorithms ☆13 Mar 9, 2023 Updated 3 years ago
- [ACM MM 2025] LIDAR: Lightweight Adaptive Cue-Aware Fusion Vision Mamba for Multimodal Segmentation of Structural Cracks ☆22 Nov 18, 2025 Updated 4 months ago
- HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model ☆91 Jul 17, 2025 Updated 8 months ago
- ☆17 Dec 1, 2023 Updated 2 years ago
- ☆12 Sep 23, 2022 Updated 3 years ago
- ☆10 Mar 1, 2021 Updated 5 years ago
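- Automatic short-video generation, automatic article-to-video conversion, multimodal mixed editing, digital humans, voice cloning ☆13 Jun 25, 2024 Updated last year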
- Parallelize the serial implementation of 3D scene reconstruction with input from a Kinect sensor and run it on an Nvidia GPU using CUDA. ☆12 Nov 2, 2016 Updated 9 years ago
- Visual SLAM from RGB-D data using Microsoft Kinect ☆10 May 13, 2016 Updated 9 years ago
- This is a project to fuse GPS/IMU/Wheel odometry for vehicle localization. ☆14 Aug 19, 2020 Updated 5 years ago
- Automatically fetches the latest articles from arXiv. ☆11 Apr 29, 2020 Updated 5 years ago
- Common template for PyTorch projects. Easy to extend and modify for new projects. ☆13 Dec 13, 2022 Updated 3 years ago
- Dual Quaternion implementation in Python. ☆11 Nov 30, 2016 Updated 9 years ago
- Pyramid Attention Network for Medical Image Registration (ISBI 2024) ☆16 Feb 6, 2025 Updated last year
- ☆60 Jun 8, 2025 Updated 9 months ago
- Custom dataset and training code, based on u2net, for training on your own dataset (basic version) ☆12 Apr 20, 2022 Updated 3 years ago