TristanThrush / perplexity-correlations

Simple and scalable tools for data-driven pretraining data selection.

☆29

Alternatives and similar repositories for perplexity-correlations

Users who are interested in perplexity-correlations are comparing it to the libraries listed below.

ekinakyurek / influence
Code for "Tracing Knowledge in Language Models Back to the Training Data"
☆39Updated 2 years ago
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆99Updated 4 years ago
XiangLi1999 / AutoBencher
☆32Updated last year
PAIR-code / pretraining-tda
☆29Updated 9 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100Updated last year
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆56Updated 2 weeks ago
princeton-nlp / LM-Kernel-FT
A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643
☆78Updated 2 years ago
aviclu / ffn-values
☆67Updated 2 years ago
hadasah / btm
☆76Updated last year
nkandpa2 / long_tail_knowledge
Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
☆78Updated 2 years ago
guy-dar / embedding-space
☆55Updated 2 years ago
google / belief-localization
This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca…
☆61Updated 2 years ago
allenai / bff
☆39Updated last year
abertsch72 / long-context-icl
Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration"
☆40Updated last year
HazyResearch / skill-it
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models
☆47Updated 2 years ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆80Updated 3 months ago
mcleish7 / gemstone-scaling-laws
Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)
☆29Updated last month
logix-project / logix
AI Logging for Interpretability and Explainability🔬
☆133Updated last year
angie-chen55 / pref-learning-ranking-acc
☆13Updated last year
jmerullo / lm_vector_arithmetic
☆36Updated 2 years ago
roeehendel / icl_task_vectors
☆101Updated 2 years ago
yuzhaouoe / pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆22Updated last year
liujch1998 / memo-trap
☆22Updated 2 years ago
tau-nlp / scrolls
The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences".
☆69Updated last year
kernelmachine / demix-data
Benchmark API for Multidomain Language Modeling
☆25Updated 3 years ago
aryamanarora / causalgym
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
☆49Updated 11 months ago
bloomberg / dataless-model-merging
Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)
☆91Updated 2 years ago
edenbiran / HoppingTooLate
Exploring the Limitations of Large Language Models on Multi-Hop Queries
☆27Updated 8 months ago
MaheepChaudhary / SAE-Ravel
Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…
☆12Updated 9 months ago
nouhadziri / faith-and-fate
☆37Updated last year