☆224Feb 21, 2023Updated 3 years ago
Alternatives and similar repositories for fly
Users that are interested in fly are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Butterfly matrix multiplication in PyTorch☆179Oct 5, 2023Updated 2 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights☆19Oct 9, 2022Updated 3 years ago
- ☆32Jan 7, 2024Updated 2 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- The implementation for MLSys 2023 paper: "Cuttlefish: Low-rank Model Training without All The Tuning"☆44May 10, 2023Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Code publication to the paper "Normalized Attention Without Probability Cage"☆17Nov 9, 2021Updated 4 years ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆33Jun 2, 2023Updated 2 years ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"☆563Dec 28, 2024Updated last year
- Reproducing RigL (ICML 2020) as a part of ML Reproducibility Challenge 2020☆29Jan 6, 2022Updated 4 years ago
- Code for testing DCT plus Sparse (DCTpS) networks☆14Jun 15, 2021Updated 4 years ago
- Parameter Efficient Transfer Learning with Diff Pruning☆74Feb 3, 2021Updated 5 years ago
- Code for the ICML 2021 and ICLR 2022 papers: Skew Orthogonal Convolutions, Improved deterministic l2 robustness on CIFAR-10 and CIFAR-100☆18Feb 20, 2022Updated 4 years ago
- ☆15Apr 11, 2024Updated 2 years ago
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation☆12Jul 31, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- ☆33Apr 12, 2021Updated 5 years ago
- ☆19Jun 3, 2023Updated 2 years ago
- End-to-end training of sparse deep neural networks with little-to-no performance loss.☆335Jan 26, 2023Updated 3 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning☆20Jul 7, 2022Updated 3 years ago
- Blog post☆17Feb 16, 2024Updated 2 years ago
- AN EFFICIENT AND GENERAL FRAMEWORK FOR LAYERWISE-ADAPTIVE GRADIENT COMPRESSION☆16Oct 27, 2023Updated 2 years ago
- ☆40Jan 5, 2024Updated 2 years ago
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 4 years ago
- ☆20May 30, 2024Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Structured Pruning Adapters in PyTorch☆19Aug 30, 2023Updated 2 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated 11 months ago
- Parallel Associative Scan for Language Models☆18Jan 8, 2024Updated 2 years ago
- ☆31Jul 2, 2023Updated 2 years ago
- Block-sparse primitives for PyTorch☆158Apr 5, 2021Updated 5 years ago
- Simple and efficient pytorch-native transformer training and inference (batched)☆78Apr 2, 2024Updated 2 years ago
- ☆63Oct 3, 2024Updated last year
- [ICML 2021] "Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training" by Shiwei Liu, Lu Yin, De…☆45Nov 11, 2023Updated 2 years ago
- BlockCIrculantRNN (LSTM and GRU) using TensorFlow☆14Oct 30, 2018Updated 7 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Sparsity support for PyTorch☆38Mar 22, 2025Updated last year
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration☆3,512Jul 17, 2025Updated 9 months ago
- Implementation for ACProp ( Momentum centering and asynchronous update for adaptive gradient methdos, NeurIPS 2021)☆16Oct 11, 2021Updated 4 years ago
- [COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs☆65Mar 9, 2026Updated last month
- ☆26Nov 23, 2023Updated 2 years ago
- ☆716Dec 6, 2025Updated 4 months ago
- A Learnable LSH Framework for Efficient NN Training☆34Jul 22, 2021Updated 4 years ago