dtsip / in-context-learning
☆238 · Updated last year
Alternatives and similar repositories for in-context-learning
Users interested in in-context-learning are comparing it to the libraries listed below.
- ☆186 · Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆108 · Updated last year
- Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature ☆161 · Updated 2 months ago
- ☆106 · Updated 7 months ago
- Using sparse coding to find distributed representations used by neural networks. ☆269 · Updated last year
- ☆122 · Updated last year
- A fast, effective data attribution method for neural networks in PyTorch ☆217 · Updated 9 months ago
- ☆168 · Updated 9 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆104 · Updated 2 years ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆77 · Updated 6 months ago
- AI Logging for Interpretability and Explainability 🔬 ☆126 · Updated last year
- Function Vectors in Large Language Models (ICLR 2024) ☆179 · Updated 4 months ago
- ☆97 · Updated last year
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆84 · Updated 3 months ago
- ☆99 · Updated last year
- ☆73 · Updated 3 years ago
- Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode… ☆18 · Updated 9 months ago
- [NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training ☆35 · Updated 5 months ago
- ☆70 · Updated 9 months ago
- ☆83 · Updated 2 years ago
- ☆186 · Updated 2 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆128 · Updated 2 months ago
- An open-source implementation of Anthropic's paper "Towards Monosemanticity: Decomposing Language Models with Dictionary Learning" ☆49 · Updated last year
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆75 · Updated 11 months ago
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆43 · Updated 10 months ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆53 · Updated 11 months ago
- Bayesian low-rank adaptation for large language models ☆24 · Updated last year
- ☆240 · Updated 11 months ago
- ☆91 · Updated last year
- Sparse Autoencoder for Mechanistic Interpretability ☆264 · Updated last year