naver-ai / elva
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, EMNLP 2024
☆15Updated last month
Alternatives and similar repositories for elva:
Users that are interested in elva are comparing it to the libraries listed below
- ☆36Updated 8 months ago
- ☆24Updated last year
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆43Updated 5 months ago
- Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023☆104Updated last year
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…☆11Updated 7 months ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific…☆62Updated 4 months ago
- Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023☆44Updated 7 months ago
- Preference Learning for LLaVA☆35Updated 2 months ago
- ☆23Updated last year
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models☆15Updated last year
- ☆46Updated 9 months ago
- ☆38Updated last year
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla…☆48Updated 3 months ago
- Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)☆56Updated 10 months ago
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…☆20Updated last month
- [ICLR 2023] RC-MAE☆51Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…☆11Updated last year
- Google's Conceptual Captions Dataset translated into Korean☆21Updated 2 years ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).☆38Updated 5 months ago
- Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn…☆17Updated last month
- [NAACL 2024] Official repository for "KTRL+F: Knowledge-Augmented In-Document Search"☆23Updated 3 months ago
- We introduce new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆11Updated last month
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆12Updated 10 months ago
- [EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut…☆21Updated last month
- [AAAI2024] BOK-VQA : Bilingual Outside Knowledge-based Visual Question Answering via Graph Representation Pretraining☆1Updated 7 months ago
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT☆52Updated 5 months ago
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities☆35Updated 4 months ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆12Updated 10 months ago
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".☆44Updated last month