allenai / understanding_mcqa
Code for the arXiv preprint "Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions"
☆16 Updated 3 months ago
Alternatives and similar repositories for understanding_mcqa
Users who are interested in understanding_mcqa are comparing it to the repositories listed below.
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆159 Updated this week
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- [COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees ☆27 Updated 4 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆141 Updated 4 months ago
- ☆82 Updated 9 months ago
- ☆94 Updated last year
- Evaluating the Moral Beliefs Encoded in LLMs ☆31 Updated 11 months ago
- ☆103 Updated last year
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆112 Updated 3 months ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆44 Updated 8 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆192 Updated 9 months ago
- This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,… ☆53 Updated 11 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆67 Updated 11 months ago
- Exploring the Limitations of Large Language Models on Multi-Hop Queries ☆27 Updated 8 months ago
- [ACL 2025] Knowledge Unlearning for Large Language Models ☆46 Updated 2 months ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆96 Updated 2 years ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. ☆107 Updated 3 weeks ago
- Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory ☆195 Updated 5 months ago
- ☆30 Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆89 Updated last year
- ☆197 Updated 7 months ago
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration" ☆40 Updated last year
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆94 Updated 6 months ago
- ☆74 Updated last year
- ☆22 Updated 11 months ago
- Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants. ☆163 Updated this week
- augmented LLM with self reflection ☆134 Updated last year
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆64 Updated 9 months ago
- ☆53 Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆37 Updated last year