JShollaj / awesome-llm-interpretability

A curated list of Large Language Model (LLM) Interpretability resources.

☆1,428

Alternatives and similar repositories for awesome-llm-interpretability

Users who are interested in awesome-llm-interpretability are comparing it to the repositories listed below.

EdinburghNLP / awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
☆974 Updated this week
andyzoujm / representation-engineering
Representation Engineering: A Top-Down Approach to AI Transparency
☆900 Updated last year
stanfordnlp / pyreft
Stanford NLP Python library for Representation Finetuning (ReFT)
☆1,514 Updated 8 months ago
facebookresearch / llm-transparency-tool
LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. …
☆841 Updated 10 months ago
lucidrains / self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
☆1,399 Updated last year
wasiahmad / Awesome-LLM-Synthetic-Data
A reading list on LLM based Synthetic Data Generation 🔥
☆1,441 Updated 4 months ago
stanfordnlp / pyvene
Stanford NLP Python library for understanding and improving PyTorch models via interventions
☆819 Updated last week
uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
☆1,206 Updated last year
microsoft / promptbench
A unified evaluation framework for large language models
☆2,731 Updated last week
ruizheliUOA / Awesome-Interpretability-in-Large-Language-Models
This repository collects all relevant resources about interpretability in LLMs
☆375 Updated 11 months ago
tjunlp-lab / Awesome-LLMs-Evaluation-Papers
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
☆785 Updated last year
amitsangani / Llama
All the projects related to Llama
☆379 Updated 6 months ago
predibase / llm_distillation_playbook
Best practices for distilling large language models.
☆578 Updated last year
open-thought / system-2-research
System 2 Reasoning Link Collection
☆857 Updated 7 months ago
jxzhangjhu / Awesome-LLM-Uncertainty-Reliability-Robustness
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
☆786 Updated 5 months ago
maitrix-org / llm-reasoners
A library for advanced large language model reasoning
☆2,291 Updated 4 months ago
prometheus-eval / prometheus-eval
Evaluate your LLM's response with Prometheus and GPT4 💯
☆1,005 Updated 5 months ago
tatsu-lab / alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
☆1,877 Updated 2 months ago
zjunlp / KnowledgeEditingPapers
Must-read Papers on Knowledge Editing for Large Language Models.
☆1,180 Updated 3 months ago
zjunlp / EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
☆2,591 Updated last week
GaryYufei / AlignLLMHumanSurvey
Aligning Large Language Models with Human: A Survey
☆735 Updated 2 years ago
TransformerLensOrg / TransformerLens
A library for mechanistic interpretability of GPT-style language models
☆2,684 Updated this week
jbloomAus / SAELens
Training Sparse Autoencoders on Language Models
☆1,001 Updated this week
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,398 Updated last month
ashishpatel26 / LLM-Finetuning
LLM Finetuning with peft
☆2,676 Updated 2 months ago
XueFuzhao / OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
☆1,614 Updated last year
leobeeson / llm_benchmarks
A collection of benchmarks and datasets for evaluating LLM.
☆517 Updated last year
srush / awesome-o1
A bibliography and survey of the papers surrounding o1
☆1,209 Updated 11 months ago
ysymyth / awesome-language-agents
List of language agents based on paper "Cognitive Architectures for Language Agents"
☆1,047 Updated 9 months ago
AnswerDotAI / fsdp_qlora
Training LLMs with QLoRA + FSDP
☆1,527 Updated 11 months ago