KwanWaiChung/M4LE

Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models

☆23

Alternatives and similar repositories for M4LE

Users who are interested in M4LE are comparing it to the repositories listed below.

xinghaow99 / pbs-attn
[ICML 2026] Sparser Block-Sparse Attention via Token Permutation
☆31 · May 22, 2026 · Updated 2 months ago
OpenLMLab / ParallelTokenizer
Use the tokenizer in parallel to achieve superior acceleration
☆20 · Mar 21, 2024 · Updated 2 years ago
yhao-wang / LLM-Knowledge-Boundary
Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation"
☆21 · Jul 31, 2023 · Updated 2 years ago
sheryc / resonance_rope
[ACL 24 Findings] Implementation of Resonance RoPE and the PosGen synthetic dataset.
☆24 · Mar 5, 2024 · Updated 2 years ago
OpenLMLab / scaling-rope
code for Scaling Laws of RoPE-based Extrapolation
☆73 · Oct 16, 2023 · Updated 2 years ago
jungokasai / T2R
☆14 · Nov 20, 2022 · Updated 3 years ago
zbchern / Neural-Relational-Topic-Models
☆22 · May 13, 2019 · Updated 7 years ago
raunak-agarwal / instruction-datasets
Datasets for Instruction Tuning of Large Language Models
☆261 · Nov 30, 2023 · Updated 2 years ago
THU-KEG / R-Eval
[KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models
☆11 · Apr 9, 2024 · Updated 2 years ago
MiuLab / Spk-Dialogue
Speaker Role Contextual Model for Dialogues
☆15 · Sep 30, 2017 · Updated 8 years ago
yujie-xing / Neural-Persona-based-Conversation-Model-Python-Version
A PyTorch re-implementation of the persona-based neural conversation model proposed by Jiwei Li, Michel Galley, Chris Brockett, Georgios …
☆26 · Apr 30, 2020 · Updated 6 years ago
RUCBM / LeaF
☆14 · Nov 2, 2025 · Updated 8 months ago
OpenLMLab / LongWanjuan
Towards Systematic Measurement for Long Text Quality
☆39 · Sep 5, 2024 · Updated last year
RUCAIBox / BAMBOO
☆36 · Mar 25, 2024 · Updated 2 years ago
zhulishe / Quantitative-investment
Use strategy in stock transaction for high revenue.
☆10 · Dec 24, 2015 · Updated 10 years ago
MiniMax-AI / mini-vela
☆37 · Apr 2, 2026 · Updated 3 months ago
KwanWaiChung / MT-Eval
Code and data for "MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models"
☆57 · Nov 18, 2025 · Updated 8 months ago
zwhong714 / weak-to-strong-preference-optimization
[ICLR 2025 Spotlight] Weak-to-strong preference optimization: stealing reward from weak aligned model
☆18 · Feb 24, 2025 · Updated last year
hrwise-nlp / AppBench
This is for EMNLP 2024 Paper: AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction
☆16 · Nov 4, 2024 · Updated last year
pdufter / staticlama
☆13 · Apr 16, 2021 · Updated 5 years ago
bigai-nlco / LooGLE
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
☆199 · Oct 8, 2024 · Updated last year
InternLM / InternLM-WQX
☆19 · Jul 5, 2024 · Updated 2 years ago
Zhaoyi-Li21 / creme
[ACL 2024] "Understanding and Patching Compositional Reasoning in LLMs"
☆14 · Aug 28, 2024 · Updated last year
shizhediao / Post-Training-Data-Flywheel
We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs.
☆66 · Oct 3, 2024 · Updated last year
ruz048 / AutoLoRA
☆10 · Apr 16, 2024 · Updated 2 years ago
zhaoxlpku / DynaAct
☆15 · Nov 12, 2025 · Updated 8 months ago
facebookresearch / ToolVerifier
This repository contains the ToolSelect dataset which was used to fine-tune Llama-2 70B for tool selection.
☆23 · Mar 11, 2024 · Updated 2 years ago
txsun1997 / nlp-paradigm-shift
Paradigm shift in natural language processing
☆42 · May 29, 2022 · Updated 4 years ago
RuishanLiu / GAN-TSC
☆11 · Oct 15, 2020 · Updated 5 years ago
OpenMOSS / Thus-Spake-Long-Context-LLM
a survey of long-context LLMs from four perspectives, architecture, infrastructure, training, and evaluation
☆62 · Mar 31, 2025 · Updated last year
norakassner / LAMA_primed_negated
☆14 · Sep 17, 2020 · Updated 5 years ago
YJiangcm / BMC
[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
☆12 · Jan 26, 2025 · Updated last year
tengxiaoliu / LM_skip
[NeurIPS 2024] Can Language Models Learn to Skip Steps?
☆21 · Jan 25, 2025 · Updated last year
wbopan / Awesome-EToDs-Survey
Collection of papers, benchmarks and newest trends in the domain of End-to-end ToDs
☆14 · Nov 18, 2023 · Updated 2 years ago
ay27 / RandomGit
Randomly grabs words and phrases from classical Chinese poetry and prose to use as git commit messages
☆11 · Jan 16, 2017 · Updated 9 years ago
GeeeekExplorer / kkbot
A Feishu/Lark AI agent bot
☆15 · Feb 27, 2026 · Updated 5 months ago
open-event-hub / title2event_baselines
[EMNLP'22] Title2Event: Benchmarking Open Event Extraction with a Large-scale Chinese Title Dataset
☆20 · Apr 4, 2023 · Updated 3 years ago
jiahao42 / Simplified-Zhihu-Daily
Android app for Zhihu Daily
☆15 · May 28, 2017 · Updated 9 years ago
horizon-llm / Think-RM
[NeurIPS 2025] Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
☆17 · Nov 2, 2025 · Updated 8 months ago