RUCBM / ICLEval
☆14 Updated last year
Alternatives and similar repositories for ICLEval
Users who are interested in ICLEval are comparing it to the repositories listed below.
- Long Context Extension and Generalization in LLMs ☆62 Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆103 Updated 3 weeks ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆52 Updated last year
- The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink… ☆101 Updated last month
- ☆34 Updated last year
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆117 Updated 10 months ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models ☆65 Updated 8 months ago
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆142 Updated last year
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?" ☆47 Updated 3 months ago
- [ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆78 Updated 11 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆55 Updated 8 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025] ☆109 Updated 8 months ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆67 Updated 6 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆79 Updated 2 years ago
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces… ☆79 Updated 9 months ago
- ☆120 Updated 4 months ago
- Use the tokenizer in parallel to achieve superior acceleration ☆20 Updated last year
- Replicating O1 inference-time scaling laws ☆90 Updated 11 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆114 Updated 5 months ago
- Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)" ☆46 Updated 4 months ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: … ☆88 Updated last month
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆40 Updated last year
- [NeurIPS'24 Spotlight] Observational Scaling Laws ☆57 Updated last year
- ☆48 Updated last year
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆31 Updated last year
- ☆195 Updated 6 months ago
- ☆47 Updated 2 months ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆174 Updated 5 months ago
- "Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw… ☆30 Updated last year
- Awesome Triton Resources ☆36 Updated 6 months ago