JShollaj / awesome-llm-interpretability
A curated list of Large Language Model (LLM) Interpretability resources.
☆1,205 · Updated 3 weeks ago
Alternatives and similar repositories for awesome-llm-interpretability:
Users interested in awesome-llm-interpretability are comparing it to the libraries listed below.
- ReFT: Representation Finetuning for Language Models ☆1,373 · Updated 2 weeks ago
- Implementation of the training framework proposed in Self-Rewarding Language Models, from Meta AI ☆1,358 · Updated 9 months ago
- LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models. … ☆789 · Updated last month
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,099 · Updated 8 months ago
- A unified evaluation framework for large language models ☆2,505 · Updated 2 months ago
- Automatically evaluate your LLMs in Google Colab ☆575 · Updated 8 months ago
- A bibliography and survey of the papers surrounding o1 ☆1,042 · Updated 2 months ago
- List of papers on hallucination detection in LLMs ☆734 · Updated 3 weeks ago
- Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions ☆677 · Updated 2 weeks ago
- Representation Engineering: A Top-Down Approach to AI Transparency ☆775 · Updated 5 months ago
- The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey ☆729 · Updated 8 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks ☆2,147 · Updated last week
- A reading list on LLM-based Synthetic Data Generation 🔥 ☆969 · Updated 2 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆970 · Updated this week
- Awesome things about LLM-powered agents: Papers / Repos / Blogs / … ☆1,765 · Updated 2 weeks ago
- This repository collects all relevant resources about interpretability in LLMs.