IAAR-Shanghai / Awesome-Attention-Heads
An awesome repository & A comprehensive survey on interpretability of LLM attention heads.
☆319 Updated last week
Alternatives and similar repositories for Awesome-Attention-Heads:
Users who are interested in Awesome-Attention-Heads are comparing it to the libraries listed below:
- The official repository of our survey paper: "Towards a Unified View of Preference Learning for Large Language Models: A Survey" ☆163 Updated 4 months ago
- [ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation ☆156 Updated 2 weeks ago
- Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin… ☆161 Updated 3 months ago
- All-in-one Web Agent framework for post-training. Start building with a few clicks! ☆232 Updated last month
- Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space" ☆144 Updated 11 months ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc. ☆206 Updated 4 months ago
- The official repo for paper, LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods. ☆301 Updated 2 months ago
- Toolkit for Prompt Compression ☆247 Updated last month
- [ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts ☆194 Updated 4 months ago
- Controllable Text Generation for Large Language Models: A Survey ☆162 Updated 6 months ago
- [ACL 2024] RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback. ☆188 Updated 6 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆178 Updated 7 months ago
- A recipe for online RLHF and online iterative DPO. ☆494 Updated 2 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆184 Updated 10 months ago
- [ICLR 2025] Tool-Planner: Task Planning with Clusters across Multiple Tools ☆104 Updated last month
- ☆349 Updated 4 months ago
- Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models ☆91 Updated 7 months ago
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆218 Updated last year
- ☆143 Updated 2 months ago
- The repo for In-context Autoencoder ☆112 Updated 10 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆405 Updated 10 months ago
- FeatureAlignment = Alignment + Mechanistic Interpretability ☆28 Updated this week
- ☆253 Updated last week
- ☆105 Updated 7 months ago
- MoH: Multi-Head Attention as Mixture-of-Head Attention ☆215 Updated 4 months ago
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale" ☆226 Updated 3 weeks ago