wjbmattingly / qwen2-vl-finetune-huggingface
This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.
☆77Jul 14, 2025Updated 7 months ago
Alternatives and similar repositories for qwen2-vl-finetune-huggingface
Users who are interested in qwen2-vl-finetune-huggingface are comparing it to the repositories listed below.
- ☆385Feb 8, 2025Updated last year
- The largest VQA dataset for Vietnamese. Related to the text content in the image.☆19Apr 9, 2025Updated 10 months ago
- Code for AAAI 2023 Paper : “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”☆18Dec 6, 2022Updated 3 years ago
- OCR IIIF images in a manifest and generate annotations☆26Feb 11, 2025Updated last year
- A repository to organize materials from the AI4LAM Teach and Learning Working Group☆14May 5, 2023Updated 2 years ago
- Code for Vision-Language Synthetic Data Enhances Echocardiography Downstream Tasks☆12May 8, 2024Updated last year
- AI-powered browser extension to chat with any webpage☆10Aug 12, 2025Updated 6 months ago
- ☆16Jan 30, 2022Updated 4 years ago
- The official code of Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition (IJCAI2023)☆27Sep 3, 2023Updated 2 years ago
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆57Sep 15, 2025Updated 5 months ago
- ☆14May 26, 2023Updated 2 years ago
- Automated bash script to set up a high-performance environment on Ubuntu Linux with RTX5090, including installations of PyTorch, Unsloth,…☆18Apr 1, 2025Updated 10 months ago
- Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at SDU@AAAI-22☆14Aug 3, 2023Updated 2 years ago
- Awesome AI in Libraries☆17Jul 21, 2023Updated 2 years ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval☆15Jan 16, 2024Updated 2 years ago
- Library for converting from RGB / GrayScale image to base64 and back.☆19Sep 19, 2022Updated 3 years ago
- A study group for v4 of the fastai introduction to deep learning course with a focus on applications in GLAM settings☆15Oct 13, 2021Updated 4 years ago
- a single interface around speech-to-speech foundation models☆27Jun 27, 2025Updated 7 months ago
- nnanno is a collection of tools that sample, annotate and apply computer vision to the Newspaper Navigator dataset☆17Oct 16, 2024Updated last year
- Web-based tool to convert model into MyriadX blob☆16Dec 9, 2025Updated 2 months ago
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604☆55Nov 4, 2025Updated 3 months ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆19Feb 9, 2026Updated last week
- The source code for running LLMs on the AAAR-1.0 benchmark.☆18Apr 5, 2025Updated 10 months ago
- A data validation tool for MARC records☆25Dec 19, 2025Updated last month
- Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.☆18Apr 23, 2023Updated 2 years ago
- Goldfish: Monolingual language models for 350 languages.☆23Aug 25, 2024Updated last year
- This repository contains source codes for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135☆19Mar 7, 2023Updated 2 years ago
- Tensorflow port implementation of Single Headed Attention RNN☆16Feb 1, 2020Updated 6 years ago
- IIIF Examples and useful code☆20Sep 10, 2025Updated 5 months ago
- NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement☆51Aug 5, 2024Updated last year
- Implementation of BitNet-1.58 instruct tuning☆27Apr 14, 2024Updated last year
- Load any clip model with a standardized interface☆22Oct 20, 2025Updated 3 months ago
- ☆11Jun 13, 2025Updated 8 months ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆28Apr 17, 2024Updated last year
- The dataset and evaluation code for MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical found…☆24Nov 21, 2025Updated 2 months ago
- ☆19Oct 1, 2021Updated 4 years ago
- Vistral-V: Visual Instruction Tuning for Vistral - Vietnamese Large Vision-Language Model.☆23Jul 1, 2024Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆23Mar 12, 2024Updated last year
- Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval☆38Aug 4, 2025Updated 6 months ago