princeton-nlp / PTP
Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
☆27 Updated 8 months ago
Alternatives and similar repositories for PTP:
Users who are interested in PTP are comparing it to the repositories listed below.
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o… ☆22 Updated 3 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆49 Updated 4 months ago
- The code and data for the paper JiuZhang3.0 ☆41 Updated 9 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆44 Updated 2 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆47 Updated 4 months ago
- Code and data for the ACL 2024 Findings paper "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning" ☆24 Updated 9 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆27 Updated 11 months ago
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following ☆15 Updated 4 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆62 Updated 3 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark. ☆15 Updated last week
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆47 Updated this week
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆24 Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆46 Updated last year
- Codebase for Instruction Following without Instruction Tuning ☆33 Updated 5 months ago
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild ☆47 Updated last year
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 Updated last year
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆36 Updated 11 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆43 Updated 2 weeks ago
- A curated list of resources about long-context in large-language models and video understanding. ☆30 Updated last year
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" ☆70 Updated last year
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging" ☆38 Updated last year
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆41 Updated 4 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆41 Updated 2 weeks ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆23 Updated 5 months ago