Understanding how features learned by neural networks evolve throughout training
☆41 · Oct 24, 2024 · Updated last year
Alternatives and similar repositories for features-across-time
Users that are interested in features-across-time are comparing it to the libraries listed below.
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 · May 23, 2024 · Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆71 · Jun 19, 2024 · Updated last year
- Erasing concepts from neural representations with provable guarantees ☆245 · Jan 27, 2025 · Updated last year
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆63 · Oct 27, 2024 · Updated last year
- ☆15 · Sep 21, 2022 · Updated 3 years ago
- ☆24 · Jan 28, 2025 · Updated last year
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆217 · Mar 16, 2026 · Updated last week
- A benchmark for mechanistic discovery of circuits in Transformers ☆16 · Dec 15, 2024 · Updated last year
- ☆58 · Jun 15, 2023 · Updated 2 years ago
- Redwood Research's transformer interpretability tools ☆15 · Apr 15, 2022 · Updated 3 years ago
- Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors ☆11 · Nov 27, 2023 · Updated 2 years ago
- Pile Deduplication Code ☆18 · May 15, 2023 · Updated 2 years ago
- ☆28 · Feb 27, 2025 · Updated last year
- A framework for few-shot evaluation of autoregressive language models. ☆13 · Feb 14, 2024 · Updated 2 years ago
- ☆19 · Nov 4, 2025 · Updated 4 months ago
- Code to perform stratified split of grouped datasets into train and validation sets using optimization ☆18 · Oct 14, 2022 · Updated 3 years ago
- ofuton provides N-dimensional FFT. ☆11 · Jan 1, 2018 · Updated 8 years ago
- Deep Probabilistic Koopman: long-term time-series forecasting under quasi-periodic uncertainty ☆23 · Nov 3, 2021 · Updated 4 years ago
- A collection of different ways to implement accessing and modifying internal model activations for LLMs ☆20 · Oct 18, 2024 · Updated last year
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La… ☆27 · Nov 3, 2025 · Updated 4 months ago
- Research Papers on Efficient Neural Fields from EffL Group ☆16 · Apr 21, 2025 · Updated 11 months ago
- A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations ☆209 · Dec 22, 2021 · Updated 4 years ago
- Code for our paper "Fixed-point Inversion for Text-to-image diffusion models" ☆19 · Oct 13, 2024 · Updated last year
- Tools for understanding how transformer predictions are built layer-by-layer ☆576 · Aug 7, 2025 · Updated 7 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆245 · Mar 16, 2026 · Updated last week
- Code used for the paper "Linguistic Features for Readability Assessment" (Deutsch, Jasbi, and Shieber 2020) ☆25 · Jul 19, 2021 · Updated 4 years ago
- Python scripts to download course videos off CDEEP ☆12 · Oct 20, 2015 · Updated 10 years ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆32 · Aug 5, 2025 · Updated 7 months ago
- Mapping out the "memory" of neural nets with data attribution ☆49 · Updated this week
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆13 · Feb 13, 2023 · Updated 3 years ago
- ☆12 · Sep 26, 2019 · Updated 6 years ago
- The best RSS reader on Ubuntu ☆81 · Jan 6, 2013 · Updated 13 years ago
- ☆131 · Aug 18, 2022 · Updated 3 years ago
- EyeClient is a browser widget for the EYE reasoner. ☆13 · Oct 13, 2017 · Updated 8 years ago
- ☆157 · Dec 30, 2025 · Updated 2 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆137 · Mar 9, 2024 · Updated 2 years ago
- Notes on Direct Preference Optimization ☆24 · Apr 14, 2024 · Updated last year
- ☆11 · Nov 27, 2019 · Updated 6 years ago
- ☆11 · Sep 10, 2023 · Updated 2 years ago