spfrommer / torchexplorer
Interactively inspect module inputs, outputs, parameters, and gradients.
☆323 · Updated last month
Alternatives and similar repositories for torchexplorer:
Users who are interested in torchexplorer are comparing it to the libraries listed below.
- Helpful tools and examples for working with flex-attention ☆635 · Updated this week
- TensorDict is a pytorch dedicated tensor container. ☆879 · Updated this week
- Package for extracting and mapping the results of every single tensor operation in a PyTorch model in one line of code. ☆547 · Updated last week
- Annotated version of the Mamba paper ☆473 · Updated 11 months ago
- torchview: visualize pytorch models ☆872 · Updated last week
- For optimization algorithm research and development. ☆491 · Updated this week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆514 · Updated this week
- Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793 ☆385 · Updated 2 months ago
- ☆253 · Updated 5 months ago
- Transform datasets at scale. Optimize datasets for fast AI model training. ☆413 · Updated this week
- Implementation of the proposed minGRU in Pytorch ☆279 · Updated last week
- ☆288 · Updated 2 months ago
- Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate" ☆417 · Updated 2 months ago
- An easy, reliable, fluid template for Python packages complete with docs, testing suites, READMEs, GitHub workflows, linting and much muc… ☆160 · Updated 3 weeks ago
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory" ☆370 · Updated last year
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆364 · Updated last week
- The AdEMAMix Optimizer: Better, Faster, Older. ☆177 · Updated 5 months ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆296 · Updated last month
- LoRA and DoRA from Scratch Implementations ☆196 · Updated 11 months ago
- FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN) ☆382 · Updated 8 months ago
- Universal Tensor Operations in Einstein-Inspired Notation for Python. ☆354 · Updated last week
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆120 · Updated 6 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆547 · Updated last month
- When it comes to optimizers, it's always better to be safe than sorry ☆179 · Updated 3 weeks ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆265 · Updated 8 months ago
- A Simplified PyTorch Implementation of Vision Transformer (ViT) ☆162 · Updated 8 months ago
- Effortless plug-and-play optimizer to cut model training costs by 50%. New optimizer that is 2x faster than Adam on LLMs. ☆379 · Updated 8 months ago
- Muon optimizer: +~30% sample efficiency with <3% wallclock overhead ☆253 · Updated last week
- xLSTM as Generic Vision Backbone ☆463 · Updated 3 months ago