☆222Feb 21, 2023Updated 3 years ago
Alternatives and similar repositories for fly
Users that are interested in fly are comparing it to the libraries listed below
- Butterfly matrix multiplication in PyTorch☆178Oct 5, 2023Updated 2 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights☆19Oct 9, 2022Updated 3 years ago
- ☆32Jan 7, 2024Updated 2 years ago
- AN EFFICIENT AND GENERAL FRAMEWORK FOR LAYERWISE-ADAPTIVE GRADIENT COMPRESSION☆14Oct 27, 2023Updated 2 years ago
- [NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference☆17Nov 6, 2024Updated last year
- train with kittens!☆63Oct 25, 2024Updated last year
- Code publication to the paper "Normalized Attention Without Probability Cage"☆17Nov 9, 2021Updated 4 years ago
- Code for testing DCT plus Sparse (DCTpS) networks☆15Jun 15, 2021Updated 4 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated 9 months ago
- A collection of research papers on efficient training of DNNs☆69Jul 6, 2022Updated 3 years ago
- ☆20May 30, 2024Updated last year
- ☆40Jan 5, 2024Updated 2 years ago
- Code for paper: End-to-end Stochastic Optimization with Energy-based Model☆16Feb 14, 2023Updated 3 years ago
- AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (Published in TMLR)☆23Oct 15, 2024Updated last year
- ☆11Oct 11, 2023Updated 2 years ago
- Explanation Optimization☆13Oct 16, 2020Updated 5 years ago
- Advanced Formal Language Theory (263-5352-00L; Spring 2023)☆10Feb 21, 2023Updated 3 years ago
- Reproducing RigL (ICML 2020) as a part of ML Reproducibility Challenge 2020☆29Jan 6, 2022Updated 4 years ago
- ☆31Jul 2, 2023Updated 2 years ago
- ☆35Apr 12, 2024Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆33Jun 2, 2023Updated 2 years ago
- The accompanying code for "Simplifying and Understanding State Space Models with Diagonal Linear RNNs" (Ankit Gupta, Harsh Mehta, Jonatha…☆23Dec 30, 2022Updated 3 years ago
- Code for the ICML 2021 and ICLR 2022 papers: Skew Orthogonal Convolutions, Improved deterministic l2 robustness on CIFAR-10 and CIFAR-100☆18Feb 20, 2022Updated 4 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning☆20Jul 7, 2022Updated 3 years ago
- A Learnable LSH Framework for Efficient NN Training☆34Jul 22, 2021Updated 4 years ago
- ☆20Jun 3, 2023Updated 2 years ago
- ☆19Jul 6, 2023Updated 2 years ago
- ☆12Sep 26, 2019Updated 6 years ago
- Code and dataset for EMNLP 2022 Findings paper "Benchmarking Language Models for Code Syntax Understanding"☆16Oct 24, 2022Updated 3 years ago
- ☆14Jul 12, 2022Updated 3 years ago
- Implementation for ACProp (Momentum centering and asynchronous update for adaptive gradient methods, NeurIPS 2021)☆16Oct 11, 2021Updated 4 years ago
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne…☆116Nov 30, 2022Updated 3 years ago
- End-to-end training of sparse deep neural networks with little-to-no performance loss.☆335Jan 26, 2023Updated 3 years ago
- [CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong C…☆25Mar 9, 2022Updated 3 years ago
- ☆133Mar 23, 2021Updated 4 years ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"☆562Dec 28, 2024Updated last year
- PyTorch library for fast transformer implementations☆1,762Mar 23, 2023Updated 2 years ago
- ☆53May 20, 2024Updated last year
- [ICLR 2022] "Sparsity Winning Twice: Better Robust Generalization from More Efficient Training" by Tianlong Chen*, Zhenyu Zhang*, Pengjun…☆40Mar 20, 2022Updated 3 years ago