ALEX-nlp/MUI-Eval

Repository for the paper: Revisiting LLM Evaluation through Mechanism Interpretability: a New Metric and Model Utility Law

☆13

Alternatives and similar repositories for MUI-Eval

Users who are interested in MUI-Eval are comparing it to the repositories listed below.

ALEX-nlp / DenoiseRL
DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes
☆36 · Updated this week
lichengunc / vist_api
Visual Storytelling API
☆37 · Feb 11, 2017 · Updated 9 years ago
OSH-2023 / osh-2023.github.io
USTC OSH 2023 course homepage
☆13 · Jul 27, 2023 · Updated 3 years ago
THU-KEG / Skill-Neuron
Source code for EMNLP2022 paper "Finding Skill Neurons in Pre-trained Transformers via Prompt Tuning".
☆18 · Mar 13, 2023 · Updated 3 years ago
wonderful9462 / IC-Former
Code for "In-Context Former: Lightning-fast Compressing Context for Large Language Model" (Findings of EMNLP 2024)
☆21 · Nov 21, 2024 · Updated last year
amudide / switch_sae
Efficient Dictionary Learning with Switch Sparse Autoencoders (SAEs)
☆25 · Dec 1, 2024 · Updated last year
TaoMiner / bridgeGap
☆14 · Jul 9, 2018 · Updated 8 years ago
antct / bert-fine-grained-ner
Fine-grained named entity recognition using BERT
☆11 · Feb 5, 2020 · Updated 6 years ago
Sixzeroo / HFUTXCNewsNotifications
Monitors the Hefei University of Technology Xuancheng Campus website for changes to its notices and sends email notifications.
☆12 · Jun 1, 2021 · Updated 5 years ago
luhongchun / Eight-Numbers_and_Eight-Queens
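Artificial intelligence: solving the eight-puzzle and eight-queens problems with hill climbing, random-restart hill climbing, simulated annealing, genetic algorithms, and heuristic search.
☆11 · Jul 15, 2021 · Updated 5 years ago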
taoky / reccli
A proof-of-concept rec.ustc.edu.cn client
☆15 · Dec 25, 2023 · Updated 2 years ago
YinBo0927 / FATE
The official code of On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
☆25 · May 13, 2026 · Updated 2 months ago
wang-yongpan / BinEnhance
The datasets and source code of the NDSS 2025 paper《BinEnhance: An Enhancement Framework Based on External Environment Semantics for Bina…
☆30 · Nov 13, 2025 · Updated 8 months ago
ARXroboticsX / ARX_PLAY
☆17 · Aug 14, 2025 · Updated 11 months ago
allenai / signal-and-noise
Measuring the Signal to Noise Ratio in Language Model Evaluation
☆31 · Aug 19, 2025 · Updated 11 months ago
allenai / fluid-benchmarking
Fluid Language Model Benchmarking
☆29 · Sep 16, 2025 · Updated 10 months ago
princeton-nlp / LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
☆138 · Jul 8, 2024 · Updated 2 years ago
Zhiyuan-Zeng / EvalTree
[COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees
☆31 · Jul 11, 2025 · Updated last year
zijunwa / ProPhy
ProPhy: Progressive Physical Alignment for Dynamic World Simulation
☆22 · Apr 15, 2026 · Updated 3 months ago
nathanhu0 / CaMeLS
Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076.
☆26 · Jan 23, 2024 · Updated 2 years ago
Zoeyyao27 / Graph-of-Thought
This repository contains the code for the paper: Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models
☆23 · Apr 27, 2024 · Updated 2 years ago
MM-Thinking / Metis-RISE
Metis-RISE: RL Incentivizes and SFT Enhances Multimodal Reasoning Model Learning
☆22 · Jun 26, 2025 · Updated last year
TaoMiner / inferwiki
☆21 · Aug 3, 2021 · Updated 4 years ago
iamNCJ / Openwrt-NanoPi-R2S
Self Tuned Openwrt for NanoPi R2S
☆11 · May 11, 2025 · Updated last year
SnowCharmQ / DPL
[2025 ACL Findings] Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization
☆25 · Oct 29, 2025 · Updated 9 months ago
muhaochen / wikiHow_paper_list
A paper list of research conducted based on wikiHow
☆27 · Mar 5, 2022 · Updated 4 years ago
lyufan / P2I-MI
[ECCV 2024] "Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment"
☆15 · Mar 12, 2025 · Updated last year
QSCTech / jvm-cpp
a basic jvm
☆12 · Jan 22, 2018 · Updated 8 years ago
Lanly109 / Solution-Markdown-Template-For-Algorithm-Contest
Solution Markdown Template For Algorithm Contest
☆29 · Sep 22, 2024 · Updated last year
wellido / DeepGraph
☆12 · Nov 30, 2018 · Updated 7 years ago
OSH-2020 / OSH-2020.github.io
Course homepage
☆33 · Jul 16, 2020 · Updated 6 years ago
seL4 / sel4test
Test suite for seL4.
☆31 · Jul 22, 2026 · Updated last week
jenisys / parse_type
parse_type extends the "parse" module (opposite of "string.format()")
☆20 · Aug 11, 2025 · Updated 11 months ago
OA256864 / MEL_Tweets
Multimodal entity linking for Tweets
☆28 · Aug 30, 2021 · Updated 4 years ago
AnselCmy / FedE
Source code for IJCKG 2021 paper "FedE: Embedding Knowledge Graphs in Federated Setting"
☆25 · Apr 15, 2022 · Updated 4 years ago
Cloud-Iris / Iris-Library
Notes accumulated while studying
☆34 · Mar 9, 2023 · Updated 3 years ago
microsoft / Hacksaw
Hardware-centric Linux kernel debloater
☆15 · Nov 28, 2023 · Updated 2 years ago
THU-KEG / OpenSAE
☆49 · Apr 12, 2026 · Updated 3 months ago
QSCTech / ZJUintl-gRPC
gRPC service for Zhejiang University Intl Campus.
☆13 · Feb 17, 2019 · Updated 7 years ago