PAIR-code / pretraining-tda
☆10 Updated last month
Alternatives and similar repositories for pretraining-tda:
Users who are interested in pretraining-tda are comparing it to the libraries listed below
- ☆72 Updated 9 months ago
- Code for "Tracing Knowledge in Language Models Back to the Training Data" ☆37 Updated 2 years ago
- ☆31 Updated last year
- Efficient Scaling laws and collaborative pretraining. ☆13 Updated this week
- Evaluation pipeline for the BabyLM Challenge 2023. ☆75 Updated last year
- ☆20 Updated 4 months ago
- ☆45 Updated 2 months ago
- ☆35 Updated 2 years ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆66 Updated 2 months ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆56 Updated 7 months ago
- Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076 ☆24 Updated last year
- ☆27 Updated 10 months ago
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 Updated 2 years ago
- PyTorch building blocks for OLMo ☆49 Updated this week
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch ☆15 Updated last year
- Minimum Description Length probing for neural network representations ☆18 Updated this week
- ☆11 Updated 7 months ago
- How do transformer LMs encode relations? ☆46 Updated 11 months ago
- AI Logging for Interpretability and Explainability🔬 ☆100 Updated 7 months ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆21 Updated 10 months ago
- ☆52 Updated last year
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆80 Updated 2 months ago
- ☆38 Updated 9 months ago
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?" ☆33 Updated 3 weeks ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆51 Updated last week
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs" ☆21 Updated 5 months ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc… ☆31 Updated 7 months ago
- A library for efficient patching and automatic circuit discovery. ☆48 Updated 2 months ago
- ☆67 Updated 5 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆42 Updated last year