EdinburghNLP/MMLongBench

EdinburghNLP / MMLongBench

The official repo of the paper "MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly"

☆175

Alternatives and similar repositories for MMLongBench

Users who are interested in MMLongBench are comparing it to the repositories listed below.

zhaowei-wang-nlp / DivScene
The code of the paper "DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects"
☆19 May 2, 2025 Updated last year
Elvin-Yiming-Du / Memory-T1
This repository is for the time reasoning task in multi-session dialogue systems.
☆16 Feb 7, 2026 Updated 5 months ago
Thomasyyj / LongBio-Benchmark
A controlled benchmark on evaluating and studying the dynamics of Long Context Language Models
☆26 Oct 17, 2025 Updated 9 months ago
HKUST-KnowComp / SubeventWriter
Official code repository for the main conference paper in EMNLP 2022: SubeventWriter: Iterative Sub-event Sequence Generation with Cohere…
☆11 Oct 16, 2022 Updated 3 years ago
HKUST-KnowComp / CAT
Code for the ACL2023 paper: CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning (https://aclant…
☆11 May 9, 2023 Updated 3 years ago
dengc2023 / LongDocURL
☆41 Apr 6, 2026 Updated 3 months ago
mayubo2333 / MMLongBench-Doc
Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆149 Sep 28, 2025 Updated 9 months ago
WeiminXiong / RationaleCL
Rationale-enhanced language models are better continual relation learners (EMNLP 2023 Main Conference)
☆12 Oct 11, 2023 Updated 2 years ago
alessiodevoto / l2compress
Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression."
☆19 Dec 13, 2024 Updated last year
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆83 Jun 20, 2026 Updated last month
xyq7 / Human-Contribution-Measurement
☆13 Jun 4, 2025 Updated last year
Yifan-Song793 / InfoCL
Findings of EMNLP 2023: InfoCL: Alleviating Catastrophic Forgetting in Continual Text Classification from An Information Theoretic Perspe…
☆14 Aug 13, 2024 Updated last year
aryopg / decore
Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination"
☆30 Dec 18, 2024 Updated last year
RuiHuangNUS / MARS-Reconfig
[ICRA 2025] Robust Self-Reconfiguration for Fault-Tolerant Control of Modular Aerial Robot Systems
☆28 Jun 9, 2025 Updated last year
ScarletPan / probase-concept
A fast and neat API for Conceptualization of Probase
☆17 Oct 28, 2019 Updated 6 years ago
Yifan-Song793 / GoodBadGreedy
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
☆31 Jul 17, 2024 Updated 2 years ago
xszheng2020 / memorization
An Empirical Study of Memorization in NLP (ACL 2022)
☆13 Jun 22, 2022 Updated 4 years ago
csmliu / pretrained-GANs
A Survey on Leveraging Pre-trained Generative Adversarial Networks for Image Editing and Restoration
☆17 Jul 22, 2022 Updated 4 years ago
princeton-nlp / continual-factoid-memorization
Continual Memorization of Factoids in Large Language Models
☆12 Nov 20, 2024 Updated last year
JiayuJeff / CostBench
The official code repository for the paper "CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments…
☆33 Jun 14, 2026 Updated last month
chenllliang / MMEvalPro
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆25 Sep 26, 2024 Updated last year
pkunlp-icler / PCA-EVAL
[ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
☆107 Mar 14, 2024 Updated 2 years ago
yuzhaouoe / pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆24 Aug 18, 2024 Updated last year
Timothyxxx / EnvInteractiveLMPapers
Paper collection of methods that use language to interact with environments, including interacting with the real world, simulated worlds, or WWW…
☆128 Jul 26, 2023 Updated 2 years ago
chiefovoavicii / MAD-OPD
Official code for "Breaking the Ceiling in On-Policy Distillation via Multi-Agent Debate" (arXiv:2605.01347).
☆31 May 7, 2026 Updated 2 months ago
WooooDyy / BMMR
Code and resources for the NeurIPS 2025 Paper "BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset" by Zhiheng X…
☆18 Oct 14, 2025 Updated 9 months ago
microsoft / MMLMCalibration
Code for EMNLP 2022 Paper: On the Calibration of Massively Multilingual Language Models
☆15 Jun 12, 2023 Updated 3 years ago
shengliu66 / FractionalReason
Official GitHub repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 Jun 30, 2025 Updated last year
SumilerGAO / SunGen
☆28 Feb 26, 2023 Updated 3 years ago
lemon-prog123 / LongRePS
Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision
☆19 Apr 1, 2025 Updated last year
lzhangbv / acpsgd
[ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning
☆10 Apr 28, 2023 Updated 3 years ago
KbsdJames / Omni-MATH
The official repository of the Omni-MATH benchmark.
☆94 Dec 22, 2024 Updated last year
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 Aug 25, 2023 Updated 2 years ago
HKUST-KnowComp / ASER
ASER (Activities, States, Events, and their Relations): a large-scale weighted eventuality knowledge graph.
☆309 Apr 9, 2024 Updated 2 years ago
Arvid-pku / Overleaf-Bib-Helper
Enhances Overleaf by allowing article searches and BibTeX retrieval from DBLP and Google Scholar
☆46 Apr 14, 2025 Updated last year
zhaoyuzhi / ICM-Assistant
ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025
☆16 Aug 25, 2025 Updated 10 months ago
TerryPei / CSP
Cross-Self KV Cache Pruning for Efficient Vision-Language Inference
☆10 Dec 15, 2024 Updated last year
RenShuhuai-Andy / my-tools
my commonly-used tools
☆64 Jan 7, 2025 Updated last year
namespace-Pt / UltraGist
☆18 Dec 2, 2024 Updated last year