naver-ai / elva
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, EMNLP 2024 (to appear)
☆13Updated 4 months ago
Related projects
Alternatives and complementary repositories for elva
- ☆32Updated 5 months ago
- ☆24Updated last year
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆42Updated 2 months ago
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…☆11Updated 4 months ago
- Google's Conceptual Captions Dataset translated into Korean☆21Updated 2 years ago
- ☆45Updated 6 months ago
- [ICLR 2023] RC-MAE☆51Updated 10 months ago
- Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023☆103Updated last year
- ☆23Updated 11 months ago
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities☆32Updated 2 months ago
- Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)☆56Updated 8 months ago
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models☆15Updated last year
- ☆9Updated last month
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specific…☆56Updated last month
- Don't Judge a Language Model by Its Last Layer: Contrastive Learning with Layer-Wise Attention Pooling☆9Updated 2 years ago
- Fine-grained Retrieval Prompt Tuning☆2Updated 3 months ago
- Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023☆43Updated 5 months ago
- An RLHF learning environment for Koreans☆23Updated last year
- Read one paper every day (weekdays only)☆53Updated 3 years ago
- ☆13Updated 2 years ago
- 📸 Code and Dataset for our ACL 2023 paper: "MPCHAT: Towards Multimodal Persona-Grounded Conversation"☆21Updated last year
- Image-text multimodal model using Polyglot☆11Updated last year
- This is an official implementation of GRIT-VLP☆21Updated 2 years ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization☆27Updated 2 months ago
- ☆38Updated last year
- ☆25Updated last year
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla…☆45Updated last month
- ☆26Updated last year
- Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn…☆14Updated this week