A minimal implementation of a LLaVA-style VLM that can process interleaved image, text, and video inputs.
☆98Dec 17, 2024Updated last year
Alternatives and similar repositories for Mini-LLaVA
Users that are interested in Mini-LLaVA are comparing it to the libraries listed below.
- A minimal re-implementation of orthogonal fine-tuning (OFT), a diffusion method, for LLMs. Based on nanoGPT and minLoRA.☆14Nov 17, 2023Updated 2 years ago
- Visualize any repo or codebase into diagram or animation☆23Oct 14, 2024Updated last year
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u…☆25Jun 4, 2025Updated 10 months ago
- A collection of demos and utilities prepared ahead of the Vector Institute Privacy Enhancing Techniques (PETs) Bootcamp.☆15Sep 22, 2022Updated 3 years ago
- [ICCV 2025] Dynamic-VLM☆28Dec 16, 2024Updated last year
- Repo of HawkLlama.☆16Jan 2, 2025Updated last year
- ☆11Feb 16, 2026Updated 2 months ago
- GRadient-INformed MoE☆264Sep 25, 2024Updated last year
- A tiny, didactical implementation of LLAMA 3☆42Dec 2, 2024Updated last year
- Directed masked autoencoders☆14Mar 25, 2026Updated 3 weeks ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…☆29Sep 25, 2024Updated last year
- [ICLR 2025] MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution☆330Jul 4, 2025Updated 9 months ago
- Unofficial Implementation of Selective Attention Transformer☆21Oct 31, 2024Updated last year
- A block pruning framework for LLMs.☆28May 17, 2025Updated 11 months ago
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆133Nov 19, 2024Updated last year
- Repo for "Uncertain Multimodal Intention and Emotion Understanding in the Wild"☆17Oct 20, 2025Updated 5 months ago
- Run SOTA Vision-Language Model Florence-2 on your data!☆15Mar 27, 2025Updated last year
- Long Context Research☆31Jan 26, 2026Updated 2 months ago
- WeGeFT: Weight‑Generative Fine‑Tuning for Multi‑Faceted Efficient Adaptation of Large Models☆23Jul 10, 2025Updated 9 months ago
- Code-Switched translations with Large Language models☆25Dec 17, 2024Updated last year
- Accompanying code for "Analyzing Vision Transformers in Class Embedding Space" (NeurIPS '23)☆15Jun 10, 2024Updated last year
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆213Jan 6, 2025Updated last year
- An efficient and scalable attention module designed to reduce memory usage and improve inference speed in large language models. Designe…☆21Jun 25, 2025Updated 9 months ago
- [ICCV 2023] Rethinking Point Cloud Registration as Masking and Reconstruction☆10Aug 14, 2023Updated 2 years ago
- ☆24Oct 28, 2024Updated last year
- Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think!☆122Mar 4, 2025Updated last year
- A bot that scrapes your jobs in real time, sorts them according to preferences, and runs an alert☆16Nov 14, 2024Updated last year
- When do we not need larger vision models?☆418Feb 8, 2025Updated last year
- Official implementation for LaCo (EMNLP 2024 Findings)☆21Oct 3, 2024Updated last year
- ☆11Apr 27, 2013Updated 12 years ago
- Supercharge your Gaianet node by generating a vector knowledge base from any API. Demo slides: https://hackmd.io/@santteegt/ByoykY4nC#/ L…☆11Nov 29, 2024Updated last year
- MSPaint for marimo and other Python notebooks☆24Oct 24, 2025Updated 5 months ago
- A Framework of Small-scale Large Multimodal Models☆976Apr 11, 2026Updated last week
- ☆12Jan 17, 2024Updated 2 years ago
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch☆18Dec 21, 2022Updated 3 years ago
- Implementation of ''VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation''☆15Sep 16, 2025Updated 7 months ago
- A plugin-oriented framework for video structuring. Developers in China, please add WeChat zhzhi78 to join the discussion group.☆18May 28, 2024Updated last year
- Unofficial Implementation of Evolutionary Model Merging☆41Mar 28, 2024Updated 2 years ago
- Using the pvanet framework to train mobilenet-v2 for object detection, paper: https://arxiv.org/abs/1611.08588☆13Feb 13, 2019Updated 7 years ago