megagonlabs / holobench
🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.; ICLR 2025)
☆12 Updated 4 months ago
Alternatives and similar repositories for holobench
Users who are interested in holobench are comparing it to the libraries listed below.
- Common tools for data processing ☆16 Updated 3 months ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024) ☆10 Updated 5 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability ☆11 Updated 4 months ago
- ☆12 Updated 3 months ago
- ☆20 Updated 2 months ago
- [NAACL 2025 Main Selected Oral] Repository for the paper: Prompt Compression for Large Language Models: A Survey ☆23 Updated last month
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024) ☆10 Updated 7 months ago
- List of papers on Self-Correction of LLMs. ☆73 Updated 6 months ago
- A library for evaluation of Grammatical Error Correction (GEC) ☆9 Updated 2 months ago
- ☆12 Updated 5 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark. ☆16 Updated 3 months ago
- SysBench: Can Large Language Models Follow System Messages? ☆31 Updated 10 months ago
- Official repo of dataset-decomposition paper [NeurIPS 2024] ☆19 Updated 6 months ago
- [ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆77 Updated 7 months ago
- ☆24 Updated last week
- Code for Benchmarking Language Model Agents for Data-Driven Science ☆28 Updated 8 months ago
- ☆15 Updated 8 months ago
- LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆29 Updated last year
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆90 Updated 7 months ago
- Long Context Extension and Generalization in LLMs ☆57 Updated 9 months ago
- This is an official implementation of the paper ``Building Math Agents with Multi-Turn Iterative Preference Learning'' with multi-turn DP… ☆27 Updated 7 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆58 Updated 4 months ago
- ☆12 Updated 7 months ago
- the instructions and demonstrations for building a formal logical reasoning capable GLM ☆53 Updated 10 months ago
- ☆45 Updated 11 months ago
- ☆46 Updated 11 months ago
- Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination" ☆23 Updated 6 months ago
- ☆10 Updated last week
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆45 Updated last year
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆60 Updated 5 months ago