okarthikb / attention-visualizer

LLM attention pattern visualizer

☆10

Alternatives and similar repositories for attention-visualizer

Users who are interested in attention-visualizer are comparing it to the libraries listed below.

joey00072 / ohara
Collection of autoregressive model implementations
☆85 Updated 2 months ago
tyler-romero / microR1
Simple repository for training small reasoning models
☆33 Updated 5 months ago
yash-srivastava19 / arrakis
Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.
☆31 Updated 2 months ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆18 Updated 5 months ago
pacman100 / peft-codegen-25
☆23 Updated 2 years ago
srush / LLM-Talk
☆51 Updated last year
KaiNylund / lm-weights-encode-time
☆68 Updated 11 months ago
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60 Updated 10 months ago
krypticmouse / matryoshka-representation-learning
PyTorch implementation for MRL
☆19 Updated last year
CarperAI / treasure_trove
☆22 Updated last year
huggingface / peft-pytorch-conference
Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…
☆14 Updated last year
RobertCsordas / moeut
☆82 Updated 10 months ago
shreyansh26 / Attention-Mask-Patterns
Using FlexAttention to compute attention with different masking patterns
☆44 Updated 9 months ago
joey00072 / microjax
JAX-like function transformation engine, but micro: microjax
☆33 Updated 8 months ago
euclaise / supertrainer2000
☆49 Updated last year
Upaya07 / NeurIPS-llm-efficiency-challenge
Code for NeurIPS LLM Efficiency Challenge
☆59 Updated last year
YuchenJin / llm.c
LLM training in simple, raw C/CUDA
☆15 Updated 7 months ago
evanatyourservice / llm-jax
Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆17 Updated 3 months ago
vivien000 / regex-constrained-decoding
Fast, High-Fidelity LLM Decoding with Regex Constraints
☆20 Updated 11 months ago
okarthikb / state-space-models
☆27 Updated last year
luyug / magix
Supercharge huggingface transformers with model parallelism.
☆77 Updated 9 months ago
argilla-io / distilabel-spin-dibt
Repository containing the SPIN experiments on the DIBT 10k ranked prompts
☆24 Updated last year
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆43 Updated last year
TristanThrush / i-am-a-strange-dataset
Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models"
☆44 Updated last year
google-deepmind / asyncdiloco
☆45 Updated last year
catid / lllm
Latent Large Language Models
☆18 Updated 10 months ago
NielsRogge / awesome-huggingface
Repository containing awesome resources regarding Hugging Face tooling.
☆47 Updated last year
kiddyboots216 / lottery-ticket-adaptation
Lottery Ticket Adaptation
☆39 Updated 7 months ago
google-research-datasets / QAmeleon
QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…
☆34 Updated last year
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 Updated last year