aryopg/mmlu-redux

☆32

Alternatives and similar repositories for mmlu-redux

Users who are interested in mmlu-redux are comparing it to the repositories listed below.

yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆83 · Jun 20, 2026 · Updated last month
ArthurConmy / MishformerLens
MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing…
☆10 · Oct 7, 2024 · Updated last year
AI-EDU-LAB / E-EVAL
Official github repo for E-Eval, a Chinese K12 education evaluation benchmark for LLMs.
☆32 · Feb 19, 2024 · Updated 2 years ago
aryopg / decore
Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination"
☆30 · Dec 18, 2024 · Updated last year
facebookresearch / UNIREX
This is the official PyTorch repo for "UNIREX: A Unified Learning Framework for Language Model Rationale Extraction" (ICML 2022).
☆28 · Feb 14, 2023 · Updated 3 years ago
Ali-Omrani / CCR
Conceptual Construct Representations
☆11 · Feb 23, 2023 · Updated 3 years ago
marsggbo / EAGAN
(ECCV2022) EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs
☆12 · Sep 15, 2022 · Updated 3 years ago
arman-aminian / network-anomaly-detection
Rahnema Final Project - Network anomaly detection
☆11 · Jul 22, 2021 · Updated 4 years ago
nightdessert / Retrieval_Head
open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality
☆241 · Aug 2, 2024 · Updated last year
amirhosseinNouri / meet-auto-admit
Use this extension to automate google meet admission.
☆11 · Mar 1, 2021 · Updated 5 years ago
amirhallaji / Computational-Intelligence
☆11 · Mar 12, 2021 · Updated 5 years ago
rsharifnasab / sbu_compiler
compiler project for compiler course (spring 99) in sbu university
☆13 · Nov 21, 2023 · Updated 2 years ago
gautierdag / kblaunch
CLI for fast launching jobs on a Kubernetes research cluster 🛸
☆15 · Jul 7, 2026 · Updated 2 weeks ago
chentong0 / copy-bench
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
☆14 · Aug 19, 2025 · Updated 11 months ago
xjtuYW / PNP
Beyond Known Clusters: Probe New Prototypes for Efficient Generalized Class Discovery
☆15 · Apr 28, 2024 · Updated 2 years ago
1995parham / shecan.sh
Use shecan in bash with ease
☆15 · Feb 8, 2019 · Updated 7 years ago
curt-tigges / probity
☆19 · Apr 10, 2025 · Updated last year
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆43 · Dec 29, 2025 · Updated 6 months ago
LivingFutureLab / ChineseSimpleQA
☆79 · Jan 24, 2025 · Updated last year
yikee / Knowledge_Conflict
Resolving Knowledge Conflicts in Large Language Models, COLM 2024
☆18 · Oct 7, 2025 · Updated 9 months ago
gunchagarg / differential-learning-rate-keras
Implementation of Differential Learning Rate in Keras
☆11 · Jun 4, 2019 · Updated 7 years ago
TIGER-AI-Lab / MMLU-Pro
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
☆412 · Mar 18, 2026 · Updated 4 months ago
1995parham / tile38-chart
Helm chart for tile38
☆15 · Updated this week
lasgroup / user_interactions
Aligning Language Models from User Interactions via Self-Distillation
☆26 · Mar 31, 2026 · Updated 3 months ago
velocityCavalry / CREPE
An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"
☆16 · Nov 5, 2024 · Updated last year
sssssyf / SSMLP-RPL
Spectral-Spatial MLP Network with Reciprocal Points learning for Open-Set Hyperspectral Image Classification
☆16 · Jul 9, 2023 · Updated 3 years ago
fishiatee / Tumera
Yet another frontend for LLM, written using .NET and WinUI 3
☆11 · Sep 14, 2025 · Updated 10 months ago
leolle / atec_nlp
Ant Financial natural language processing competition.
☆10 · Sep 3, 2018 · Updated 7 years ago
TransluceAI / introspective-interp
Repository for "Training Language Models To Explain Their Own Computations"
☆23 · Jul 7, 2026 · Updated 2 weeks ago
arman-aminian / AIC21-Client-Python
Sharif-AI-Challenge2021 Client
☆11 · Aug 20, 2021 · Updated 4 years ago
yikee / ScienceMeter
ScienceMeter: Tracking Scientific Knowledge Updates in Language Models, COLM 2026
☆17 · Jun 28, 2025 · Updated last year
cs-chan / Fuzzy-Compression
Caffe/Neon prototxt training file for our Neurocomputing2017 work: Fuzzy Quantitative Deep Compression Network
☆11 · May 30, 2018 · Updated 8 years ago
yogeshbalaji / Normalized-Wasserstein
Normalized Wasserstein for Mixture Distributions
☆11 · Mar 24, 2023 · Updated 3 years ago
allenai / signal-and-noise
Measuring the Signal to Noise Ratio in Language Model Evaluation
☆31 · Aug 19, 2025 · Updated 11 months ago
JamesClough / manifold_alignment
A Python module for mapping multiple high-dimensional datasets into a common low-dimensional space.
☆10 · Mar 29, 2018 · Updated 8 years ago
nishiwen1214 / Benchmark-leakage-detection
Official implementation of "Training on the Benchmark Is Not All You Need".
☆40 · Dec 31, 2024 · Updated last year
bugensui / WenTianSearch
"Alibaba Lingjie" Wentian Engine e-commerce search algorithm competition, 13/2771
☆10 · Jul 31, 2022 · Updated 3 years ago
zhaowei-wang-nlp / DivScene
The code of the paper "DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects"
☆19 · May 2, 2025 · Updated last year
UKPLab / tmlr2026-manifold-analysis
☆21 · Mar 3, 2026 · Updated 4 months ago