YunxinLi / LingCloud
Attaching human-like eyes to the large language model. Code for the IEEE TMM paper "LMEye: An Interactive Perception Network for Large Language Models"
☆48Updated 9 months ago
Alternatives and similar repositories for LingCloud:
Users who are interested in LingCloud are comparing it to the libraries listed below.
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Updated last year
- ☆35Updated last year
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning☆135Updated last year
- CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training☆34Updated 3 years ago
- Source code for the paper "Prefix Language Models are Unified Modal Learners"☆43Updated last year
- This repo contains codes and instructions for baselines in the VLUE benchmark.☆41Updated 2 years ago
- ☆17Updated last year
- Data for evaluating GPT-4V☆11Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆44Updated 10 months ago
- Official repository of MMDU dataset☆89Updated 6 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆73Updated 5 months ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions☆22Updated 10 months ago
- Visual and Embodied Concepts evaluation benchmark☆21Updated last year
- my commonly-used tools☆52Updated 3 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆72Updated 5 months ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning☆24Updated last year
- ☆18Updated 9 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆46Updated 5 months ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Updated last year
- Touchstone: Evaluating Vision-Language Models by Language Models☆82Updated last year
- ☆63Updated last year
- Released code for our ICLR23 paper.☆64Updated 2 years ago
- ☆98Updated last year
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"☆49Updated 5 months ago
- ☆45Updated 7 months ago
- ☆45Updated last year
- Code for Findings of EMNLP2023 paper "Coarse-to-Fine Dual Encoders are Better Frame Identification Learners"☆12Updated last year
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".☆32Updated last year
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆46Updated last year
- ☆28Updated last month