Code for the paper "The emergence of clusters in self-attention dynamics".
☆17Dec 18, 2023Updated 2 years ago
Alternatives and similar repositories for 2023-transformers
Users that are interested in 2023-transformers are comparing it to the libraries listed below
- Multi-Layer Sparse Autoencoders (ICLR 2025)☆29Feb 6, 2026Updated 3 weeks ago
- Projection operator method for statistical data analysis☆10Mar 11, 2025Updated 11 months ago
- Valentine's Day Anonymous matching☆10Jul 25, 2014Updated 11 years ago
- ☆19Jan 2, 2026Updated 2 months ago
- Dynamic mode decomposition in Python☆13Jun 9, 2015Updated 10 years ago
- Big-data traditional Chinese medicine (大数中医)☆11Jul 10, 2024Updated last year
- Clustered Compositional Embeddings☆11Oct 25, 2023Updated 2 years ago
- Conditional Linear Dynamical Systems☆15Oct 7, 2025Updated 4 months ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- Basic implementation of variational autoencoders in Torch☆10Apr 16, 2016Updated 9 years ago
- Estimators for Information Theoretic Functionals using Influence Functions☆11Apr 17, 2016Updated 9 years ago
- API utility for Tor (The Onion Router), such as requesting a new IP or generating an API password. Uses Network API for control☆12Feb 27, 2025Updated last year
- Implementation of Variance Reduction Techniques in Julia☆11Sep 6, 2016Updated 9 years ago
- ☆10Apr 26, 2024Updated last year
- ☆10Feb 19, 2019Updated 7 years ago
- Official code for "IT³: Idempotent Test-Time Training" (ICML 2025)☆14Jun 25, 2025Updated 8 months ago
- [IJCV 2022] Domain-Specific Bias Filtering for Single Labeled Domain Generalization☆12Nov 10, 2022Updated 3 years ago
- Least Squares Regression for subspace clustering☆10May 27, 2018Updated 7 years ago
- Efficient implementation of Generative Stochastic Networks☆12Nov 28, 2013Updated 12 years ago
- Implementation of Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems☆14Nov 11, 2023Updated 2 years ago
- ☆12Sep 16, 2024Updated last year
- Don't just regulate gradients like in Muon, regulate the weights too☆31Jul 30, 2025Updated 7 months ago
- Naive Bayes and kNN Java demo☆14Aug 29, 2013Updated 12 years ago
- ☆24Sep 3, 2025Updated 6 months ago
- Ἀνατομή is a PyTorch library to analyze representation of neural networks☆13Jan 31, 2024Updated 2 years ago
- ☆11Oct 11, 2024Updated last year
- ☆14Feb 25, 2019Updated 7 years ago
- General purpose code and examples for the mpEDMD algorithm.☆16Nov 16, 2023Updated 2 years ago
- ☆12Jan 17, 2024Updated 2 years ago
- A chrome extension to highlight trans erasure☆13Feb 14, 2025Updated last year
- Material for ODL workshop Dec 2017☆12Dec 26, 2017Updated 8 years ago
- What if you could imitate a famous celebrity's voice or sing like a famous singer? This project started with a goal to convert someone's …☆14Sep 30, 2022Updated 3 years ago
- Implementation for "An Approximation of the Error Backpropagation Algorithm in a Predictive Coding Network with Local Hebbian Synaptic Pl…☆15Oct 10, 2018Updated 7 years ago
- Find context neurons in Pythia models.☆13Jun 13, 2023Updated 2 years ago
- Jump ReLU☆11Apr 8, 2019Updated 6 years ago
- ☆17Jul 9, 2025Updated 7 months ago
- Unofficial Scalable-Softmax Is Superior for Attention☆20May 30, 2025Updated 9 months ago
- In this repository we have all the code that we have developed☆12Sep 13, 2023Updated 2 years ago
- An implementation of LazyLLM token pruning for LLaMa 2 model family.☆13Jan 6, 2025Updated last year