AugustasMacijauskas / trailtoken
An application that visualises LLM tokenizers
☆10 Updated 4 months ago
Alternatives and similar repositories for trailtoken:
Users who are interested in trailtoken compare it to the libraries listed below.
- Tools for studying developmental interpretability in neural networks. ☆83 Updated last week
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… ☆11 Updated 3 months ago
- The Happy Faces Benchmark ☆14 Updated last year
- ☆61 Updated last year
- Erasing concepts from neural representations with provable guarantees ☆221 Updated this week
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆108 Updated 2 years ago
- git extension for {collaborative, communal, continual} model development ☆207 Updated 2 months ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆21 Updated 2 years ago
- Implementing RASP transformer programming language https://arxiv.org/pdf/2106.06981.pdf. ☆45 Updated 3 years ago
- ☆117 Updated last year
- Fairness toolkit for pytorch, scikit learn and autogluon ☆31 Updated last month
- ☆67 Updated last year
- Mechanistic Interpretability Visualizations using React ☆224 Updated last month
- PyPSDD porting to Python 3 + PyTorch equivalent tree construction. ☆15 Updated last year
- Mechanistic Interpretability for Transformer Models ☆49 Updated 2 years ago
- PAIR.withgoogle.com and friend's work on interpretability methods ☆162 Updated last week
- Extract full next-token probabilities via language model APIs ☆229 Updated 11 months ago
- Einsum with einops style variable names ☆16 Updated 8 months ago
- ☆20 Updated last year
- ☆19 Updated last year
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆18 Updated last week
- Course resources and notes for the ESSLLI 2023 course on neural symbolic methods. ☆16 Updated last year
- (Model-written) LLM evals library ☆17 Updated last month
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆193 Updated this week
- Repository collecting resources and best practices to improve experimental rigour in deep learning research. ☆27 Updated last year
- Code for the ACL 2022 Paper "A Feasibility Study of Answer-Agnostic Question Generation for Education" ☆17 Updated 2 years ago
- Interpretability for sequence generation models 🐛 🔍 ☆396 Updated 2 months ago
- Code to reproduce data for Bias in Bios ☆43 Updated last year
- ☆87 Updated 2 years ago
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆177 Updated last month