mattneary / attention
visualizing attention for LLM users
☆237 · Updated last year
Alternatives and similar repositories for attention
Users who are interested in attention are comparing it to the libraries listed below.
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆225 · Updated last year
- Extract full next-token probabilities via language model APIs ☆248 · Updated last year
- Evaluating LLMs with fewer examples ☆170 · Updated last year
- Mass-editing thousands of facts into a transformer memory (ICLR 2023) ☆532 · Updated last year
- ☆301 · Updated 2 years ago
- Code accompanying "How I learned to start worrying about prompt formatting". ☆113 · Updated 6 months ago
- ☆559 · Updated last year
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆221 · Updated last week
- Controlled Text Generation via Language Model Arithmetic ☆224 · Updated last year
- ☆200 · Updated 8 months ago
- A package to generate summaries of long-form text and evaluate the coherence of these summaries. Official package for our ICLR 2024 paper… ☆128 · Updated last year
- Benchmarking LLMs with Challenging Tasks from Real Users ☆246 · Updated last year
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆196 · Updated 10 months ago
- Steering vectors for transformer language models in Pytorch / Huggingface ☆134 · Updated 10 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆278 · Updated last year
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆145 · Updated last year
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆233 · Updated last year
- code for training & evaluating Contextual Document Embedding models ☆201 · Updated 7 months ago
- Tools for understanding how transformer predictions are built layer-by-layer ☆554 · Updated 4 months ago
- DSIR large-scale data selection framework for language model training ☆266 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆202 · Updated last year
- ☆258 · Updated 9 months ago
- awesome synthetic (text) datasets ☆315 · Updated last month
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆263 · Updated 10 months ago
- ☆313 · Updated last year
- [EMNLP 2023] Adapting Language Models to Compress Long Contexts ☆323 · Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆253 · Updated last year
- Scaling Data-Constrained Language Models ☆343 · Updated 5 months ago
- Functional Benchmarks and the Reasoning Gap ☆90 · Updated last year
- RuLES: a benchmark for evaluating rule-following in language models ☆244 · Updated 10 months ago