mynameischaos/Lion

Lion: Kindling Vision Intelligence within Large Language Models

☆51

Alternatives and similar repositories for Lion

Users who are interested in Lion often compare it to the repositories listed below.

ChenAnno / Real20M_ACMMM2023
Official implementation for "Real20M: A Large-scale E-commerce Dataset for Cross-domain Retrieval"
☆25 · Oct 27, 2025 · Updated 9 months ago
scenarios / WeMM
☆90 · Jul 4, 2024 · Updated 2 years ago
PCIResearch / TransCore-M
Large Multimodal Model
☆15 · Apr 8, 2024 · Updated 2 years ago
HaozheZhao / MIC
MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU
☆361 · Dec 18, 2023 · Updated 2 years ago
buptlihang / CVLM
☆23 · Jan 8, 2024 · Updated 2 years ago
lizhaoliu-Lec / CG-VLM
This is the official repo for "Contrastive Vision-Language Alignment Makes Efficient Instruction Learner".
☆20 · Dec 1, 2023 · Updated 2 years ago
huggingface / m4-logs
M4 experiment logbook
☆59 · Aug 21, 2023 · Updated 2 years ago
kyegomez / PALI3
Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"
☆147 · Updated this week
JiuTian-VL / JiuTian-LION
[CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
☆154 · Sep 3, 2025 · Updated 10 months ago
MonolithFoundation / Bumblebee
A simple MLLM that surpasses QwenVL-Max using only open-source data and a 14B LLM.
☆38 · Sep 9, 2024 · Updated last year
YuchenLiu98 / COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
☆211 · Jan 8, 2025 · Updated last year
FuxiaoLiu / MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
☆95 · Jan 7, 2025 · Updated last year
Tencent-QQMM / PureMM
☆21 · Feb 29, 2024 · Updated 2 years ago
yiren-jian / BLIText
[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
☆26 · Dec 5, 2023 · Updated 2 years ago
TempleX98 / MoVA
[NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context
☆174 · Sep 25, 2024 · Updated last year
BAAI-DCAI / DataOptim
A collection of visual instruction tuning datasets.
☆77 · Mar 14, 2024 · Updated 2 years ago
mightyzau / RegionBLIP
☆59 · Aug 7, 2023 · Updated 2 years ago
VQAssessment / BVQI
[ICME 2023 Oral, Extended to TIP (UR)] The best zero-shot VQA approach that even outperforms several fully-supervised methods.
☆41 · Jul 11, 2023 · Updated 3 years ago
lucasjinreal / wnnx_models
Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`
☆12 · Jun 22, 2022 · Updated 4 years ago
bytedance / lynx-llm
Paper: https://arxiv.org/abs/2307.02469 Page: https://lynx-llm.github.io/
☆272 · Aug 9, 2023 · Updated 2 years ago
vl-illusion / GVIL
Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"
☆15 · Jan 25, 2024 · Updated 2 years ago
shikras / shikra
☆814 · Jul 8, 2024 · Updated 2 years ago
shilinyan99 / PanoVOS
[ECCV 2024] PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation
☆21 · Jul 2, 2024 · Updated 2 years ago
mlpc-ucsd / BLIVA
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
☆261 · Apr 14, 2024 · Updated 2 years ago
opendatalab / image-downloader
☆31 · May 13, 2024 · Updated 2 years ago
luogen1996 / LLaVA-HR
[ICLR 2025] LLaVA-HR: High-Resolution Large Language-Vision Assistant
☆249 · Aug 14, 2024 · Updated last year
InternScience / SimChart9K
A simulated dataset consisting of 9,536 charts and associated data annotations in CSV format.
☆26 · Feb 22, 2024 · Updated 2 years ago
OpenGVLab / Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…
☆566 · Apr 21, 2024 · Updated 2 years ago
GitHubOfHyl97 / SkeAttnCLR
The official PyTorch implementation of "Part Aware Contrastive Learning for Self-Supervised Action Recognition" in IJCAI 2023
☆13 · Nov 9, 2023 · Updated 2 years ago
Dedsec-Xu / DatasetImgLabel-ICDAR2015
DatasetImgLabeler is an image annotation tool for researchers to prepare datasets in ICDAR2015 format
☆12 · Dec 7, 2019 · Updated 6 years ago
kyegomez / Qwen-VL
My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…
☆13 · Jan 29, 2024 · Updated 2 years ago
OpenGVLab / all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of …
☆507 · Aug 9, 2024 · Updated last year
yangbang18 / MultiCapCLIP
(ACL 2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
☆36 · Aug 8, 2024 · Updated last year
thunlp / CPT
Colorful Prompt Tuning for Pre-trained Vision-Language Models
☆49 · Nov 1, 2022 · Updated 3 years ago
Q-Future / Q-Ground
Official code for "Q-Ground: Image Quality Grounding with Large Multi-modality Models", ACM MM 2024 (Oral)
☆49 · Apr 21, 2026 · Updated 3 months ago
2bgm / KIE-HVQA
☆13 · Jun 10, 2025 · Updated last year
yeliudev / nncore
A lightweight machine learning toolkit for researchers, providing common model design & learning functionalities.
☆29 · Jul 9, 2026 · Updated 2 weeks ago
alibaba / conv-llava
☆128 · Jul 29, 2024 · Updated 2 years ago
Q-Future / Q-Bench
[ICLR 2024 Spotlight] (GPT-4V/Gemini-Pro/Qwen-VL-Plus+16 OS MLLMs) A benchmark for multi-modality LLMs (MLLMs) on low-level vision and vi…
☆287 · Aug 12, 2024 · Updated last year