YunxinLi / LingCloud

Attaching human-like eyes to the large language model. Code for the IEEE TMM paper "LMEye: An Interactive Perception Network for Large Language Model"

☆48

Alternatives and similar repositories for LingCloud

Users who are interested in LingCloud are comparing it to the repositories listed below.

SihengLi99 / TextBind
[2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild
☆47 Updated last year
HYPJUDY / Sparkles
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
☆44 Updated last year
TobiasLee / VEC
Visual and Embodied Concepts evaluation benchmark
☆21 Updated last year
M3-IT / YING-VLM
Vision Large Language Models trained on M3IT instruction tuning dataset
☆17 Updated last year
MichaelZhouwang / VLUE
This repo contains code and instructions for baselines in the VLUE benchmark.
☆41 Updated 3 years ago
OFA-Sys / TouchStone
Touchstone: Evaluating Vision-Language Models by Language Models
☆83 Updated last year
Victorwz / VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
☆56 Updated 2 years ago
PLUM-Lab / MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
☆135 Updated 2 years ago
shizhediao / DaVinci
Source code for the paper "Prefix Language Models are Unified Modal Learners"
☆43 Updated 2 years ago
vlf-silkie / VLFeedback
☆100 Updated last year
patrick-tssn / Awesome-Colorful-LLM
Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics,…
☆123 Updated last month
ChenDelong1999 / polite-flamingo
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
☆64 Updated last year
CMMMU-Benchmark / CMMMU
☆47 Updated 10 months ago
Yangyi-Chen / CoTConsistency
The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".
☆33 Updated last year
xiangyu-mm / EasyGen
The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"
☆74 Updated 8 months ago
guilk / KAT
Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language"
☆65 Updated 3 years ago
kugwzk / DiDE
Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”
☆30 Updated 2 years ago
dvlab-research / Mr-Ben
This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"
☆50 Updated 8 months ago
DAMO-NLP-SG / Auto-Arena-LLMs
☆39 Updated 9 months ago
mlfoundations / VisIT-Bench
☆50 Updated last year
thunlp / Muffin
☆65 Updated last year
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆155 Updated 9 months ago
MAmmoTH-VL / MAmmoTH-VL
(ACL 2025) MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
☆47 Updated last month
BenfengXu / KNNPrompting
Released code for our ICLR23 paper.
☆65 Updated 2 years ago
lancopku / DynamicKD
Code for EMNLP 2021 main conference paper "Dynamic Knowledge Distillation for Pre-trained Language Models"
☆41 Updated 2 years ago
OpenGVLab / Awesome-LLM4Tool
A curated list of papers, repositories, tutorials, and anything else related to large language models for tools
☆67 Updated last year
open-vision-language / oven
☆39 Updated last year
Yifan-Song793 / GoodBadGreedy
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
☆30 Updated last year
limanling / KnowledgeVL-Reading
☆68 Updated 2 years ago
JetRunner / SuperICL
Code for "Small Models are Valuable Plug-ins for Large Language Models"
☆130 Updated 2 years ago