facebookresearch/projUNN
Fast training of unitary deep network layers from low-rank updates. (☆ 28)
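As context for the listing below: projUNN keeps weight matrices unitary by applying a low-rank gradient update and then projecting back onto the unitary manifold. The following is a minimal NumPy sketch of that projection idea, assuming a rank-1 gradient step and a naive full SVD (the actual library exploits the low-rank structure to avoid the full-matrix SVD); the matrix size, learning rate, and gradient here are illustrative assumptions, not the repository's API.

```python
import numpy as np

def project_to_unitary(w):
    """Closest unitary matrix to w in Frobenius norm: the polar factor U @ Vh of its SVD."""
    u, _, vh = np.linalg.svd(w)
    return u @ vh

rng = np.random.default_rng(0)
n, lr = 8, 0.1  # illustrative size and step size

# Start from a random unitary matrix.
w = project_to_unitary(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# Hypothetical rank-1 gradient, written as an outer product of two vectors.
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
grad = np.outer(a, b.conj())

# Gradient step followed by projection back onto the unitary manifold.
w_new = project_to_unitary(w - lr * grad)
assert np.allclose(w_new.conj().T @ w_new, np.eye(n), atol=1e-8)
```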
Alternatives and similar repositories for projUNN:
Users interested in projUNN are comparing it to the libraries listed below.
- Latest Weight Averaging (NeurIPS HITY 2022) (☆ 28)
- CUDA implementation of autoregressive linear attention, with all the latest research findings (☆ 44)
- Meta-learning inductive biases in the form of useful conserved quantities. (☆ 37)
- FID computation in Jax/Flax. (☆ 27)
- Code for the note "NF4 Isn't Information Theoretically Optimal (and that's Good)" (☆ 18)
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes" (☆ 19)
- Automatically take good care of your preemptible TPUs (☆ 36)
- [TMLR 2022] Curvature access through the generalized Gauss-Newton's low-rank structure: Eigenvalues, eigenvectors, directional derivative… (☆ 17)
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs (☆ 36)
- Experiment of using Tangent to autodiff Triton (☆ 75)
- Unofficial but efficient implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX (☆ 82)
- Implementation of some personal helper functions for Einops, my favorite tensor manipulation library ❤️ (☆ 53)
- Blog post (☆ 16)
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…" (☆ 35)
- Parallelizing non-linear sequential models over the sequence length (☆ 50)
- Code accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient" (☆ 27)
- Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers" (NeurIPS 2021)☆48Updated last year
- Official repository for the paper "Zero-Shot AutoML with Pretrained Models"☆43Updated last year
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) (☆ 59)
- [ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705) (☆ 21)
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] (☆ 60)
- AdaCat (☆ 49)
- A selection of neural network models ported from torchvision for JAX & Flax. (☆ 44)
- MaskedTensors for PyTorch (☆ 39)