spfrommer / torchexplorer
Interactively inspect module inputs, outputs, parameters, and gradients.
☆330 · Updated 3 months ago
Alternatives and similar repositories for torchexplorer:
Users who are interested in torchexplorer are comparing it to the libraries listed below.
- Annotated version of the Mamba paper ☆478 · Updated last year
- Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate" ☆421 · Updated 3 months ago
- Helpful tools and examples for working with flex-attention ☆701 · Updated 2 weeks ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆308 · Updated 3 months ago
- torchview: visualize pytorch models ☆898 · Updated 3 weeks ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆524 · Updated last month
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆229 · Updated 2 months ago
- When it comes to optimizers, it's always better to be safe than sorry ☆216 · Updated this week
- Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793 ☆399 · Updated 3 months ago
- TensorHue is a Python library that allows you to visualize tensors right in your console, making understanding and debugging tensor conte… ☆114 · Updated last month
- An easy, reliable, fluid template for python packages complete with docs, testing suites, READMEs, github workflows, linting and much muc… ☆167 · Updated 2 months ago
- Package for extracting and mapping the results of every single tensor operation in a PyTorch model in one line of code. ☆568 · Updated 3 weeks ago
- TensorDict is a pytorch dedicated tensor container. ☆901 · Updated this week
- Torch nn visualization ☆51 · Updated last year
- A pytorch quantization backend for optimum ☆910 · Updated 3 weeks ago
- ☆152 · Updated last year
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory" ☆374 · Updated last year
- Muon optimizer: +>30% sample efficiency with <3% wallclock overhead ☆539 · Updated last week
- For optimization algorithm research and development. ☆502 · Updated this week
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆223 · Updated last month
- The AdEMAMix Optimizer: Better, Faster, Older. ☆179 · Updated 6 months ago
- Universal Tensor Operations in Einstein-Inspired Notation for Python. ☆364 · Updated last month
- ☆289 · Updated 3 months ago
- Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA ☆778 · Updated this week
- A Simplified PyTorch Implementation of Vision Transformer (ViT) ☆175 · Updated 9 months ago
- ☆261 · Updated last month
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI ☆276 · Updated 2 weeks ago
- ☆182 · Updated this week
- depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile. ☆614 · Updated 3 months ago
- Build high-performance AI models with modular building blocks ☆492 · Updated this week