allenai / understanding_mcqa

Code for the arXiv preprint "Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions"

☆16

Alternatives and similar repositories for understanding_mcqa

Users who are interested in understanding_mcqa are comparing it to the libraries listed below.

zjunlp / KnowledgeCircuits
[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers
☆159 Updated this week
GAIR-NLP / scaleeval
Scalable Meta-Evaluation of LLMs as Evaluators
☆42 Updated last year
Zhiyuan-Zeng / EvalTree
[COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees
☆27 Updated 4 months ago
stanfordnlp / axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
☆141 Updated 4 months ago
technion-cs-nlp / LLMsKnow
☆82 Updated 9 months ago
saprmarks / geometry-of-truth
☆94 Updated last year
ninodimontalcino / moralchoice
Evaluating the Moral Beliefs Encoded in LLMs
☆31 Updated 11 months ago
ScalerLab / JudgeBench
☆103 Updated last year
Yu-Fangxu / FoR
[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
☆112 Updated 3 months ago
allenai / SciRIFF
Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.
☆44 Updated 8 months ago
shengliu66 / ICV
Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
☆192 Updated 9 months ago
tonychenxyz / selfie
This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,…
☆53 Updated 11 months ago
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆67 Updated 11 months ago
edenbiran / HoppingTooLate
Exploring the Limitations of Large Language Models on Multi-Hop Queries
☆27 Updated 8 months ago
zjunlp / unlearn
[ACL 2025] Knowledge Unlearning for Large Language Models
☆46 Updated 2 months ago
meg-tong / sycophancy-eval
Datasets from the paper "Towards Understanding Sycophancy in Language Models"
☆96 Updated 2 years ago
SALT-NLP / collaborative-gym
Framework and toolkits for building and evaluating collaborative agents that can work together with humans.
☆107 Updated 3 weeks ago
suzgunmirac / dynamic-cheatsheet
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory
☆195 Updated 5 months ago
dxhou / CoAct
☆30 Updated last year
Re-Align / just-eval
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
☆89 Updated last year
da03 / Internalize_CoT_Step_by_Step
☆197 Updated 7 months ago
abertsch72 / long-context-icl
Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration"
☆40 Updated last year
AlexCuadron / ThinkingAgent
Systematic evaluation framework that automatically rates overthinking behavior in large language models.
☆94 Updated 6 months ago
abhika-m / FAVA
☆74 Updated last year
technion-cs-nlp / hallucination-mitigation
☆22 Updated 11 months ago
OpenMOSS / Language-Model-SAEs
Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants.
☆163 Updated this week
rxlqn / awesome-llm-self-reflection
Augmented LLMs with self-reflection
☆134 Updated last year
WindyLee0822 / Process_Q_Model
Official implementation of the paper "Process Reward Model with Q-value Rankings"
☆64 Updated 9 months ago
marzenakrp / nocha
☆53 Updated last year
dinobby / MAGDi
The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…
☆37 Updated last year