2U1 / Gemma3-Finetune
An open-source implementation for the Gemma3 series by Google.
☆69 Updated 2 weeks ago
Alternatives and similar repositories for Gemma3-Finetune
Users who are interested in Gemma3-Finetune are comparing it to the repositories listed below.
- An open-source implementation for fine-tuning SmolVLM. ☆62 Updated 4 months ago
- An open-source implementation for fine-tuning Llama3.2-Vision series by Meta. ☆174 Updated 3 months ago
- An open-source implementation for fine-tuning Molmo-7B-D and Molmo-7B-O by allenai. ☆62 Updated 9 months ago
- [Fully open] [Encoder-free MLLM] Vision as LoRA ☆378 Updated 7 months ago
- ☆124 Updated last year
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context ☆170 Updated last year
- [ACL 2025 🔥] Rethinking Step-by-step Visual Reasoning in LLMs ☆310 Updated 8 months ago
- CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts ☆162 Updated last year
- A Simple Framework of Small-scale LMMs for Video Understanding ☆108 Updated 7 months ago
- [NeurIPS 2024] Official PyTorch implementation code for realizing the technical part of Mamba-based traversal of rationale (Meteor) to im… ☆116 Updated last year
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters". ☆69 Updated 9 months ago
- OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe ☆141 Updated last month
- ☆107 Updated 7 months ago
- An open-source implementation for fine-tuning Phi3-Vision and Phi3.5-Vision by Microsoft. ☆98 Updated 4 months ago
- [EMNLP 2024] Official PyTorch implementation code for realizing the technical part of Traversal of Layers (TroL) presenting new propagati… ☆99 Updated last year
- Reproduction of LLaVA-v1.5 based on Llama-3-8b LLM backbone. ☆65 Updated last year
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling" ☆147 Updated last year
- [ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining" ☆190 Updated 10 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆65 Updated last year
- Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding ☆210 Updated 3 months ago
- [ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning ☆158 Updated 5 months ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines ☆129 Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.