yash-srivastava19 / arrakis
Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.
☆29Updated 2 months ago
Alternatives and similar repositories for arrakis
Users who are interested in arrakis are comparing it to the libraries listed below.
- Simple repository for training small reasoning models☆33Updated 4 months ago
- ☆134Updated 2 months ago
- PyTorch implementation for MRL☆18Updated last year
- Attribution-based Parameter Decomposition☆25Updated 2 weeks ago
- Official repo for Learning to Reason for Long-Form Story Generation☆63Updated 2 months ago
- ☆51Updated 7 months ago
- An introduction to LLM Sampling☆78Updated 6 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆32Updated 2 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- Compiling useful links, papers, benchmarks, ideas, etc.☆46Updated 3 months ago
- rl from zero pretrain, can it be done? we'll see.☆56Updated this week
- Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State☆18Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆58Updated last month
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks☆43Updated 6 months ago
- ☆14Updated last year
- LLM attention pattern visualizer☆10Updated last year
- Prune transformer layers☆69Updated last year
- ☆60Updated 3 weeks ago
- Open source replication of Anthropic's Crosscoders for Model Diffing☆55Updated 7 months ago
- ☆22Updated last year
- Simple GRPO scripts and configurations.☆58Updated 4 months ago
- LLM training in simple, raw C/CUDA☆14Updated 6 months ago
- ☆23Updated last year
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e…☆27Updated last year
- Experiments with representation engineering☆11Updated last year
- ☆60Updated last week
- Engine for collecting, uploading, and downloading model activations☆18Updated 2 months ago
- ☆28Updated last year
- ☆47Updated 4 months ago
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode…☆45Updated 2 months ago