☆39 · Updated Aug 27, 2024
Alternatives and similar repositories for GRIFFIN
Users interested in GRIFFIN are comparing it to the libraries listed below.
- Code for the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform (☆16 · Updated Nov 1, 2021)
- ☆32 · Updated Nov 11, 2024
- ☆159 · Updated Feb 15, 2025
- Code for the accelerated SDE (☆12 · Updated Sep 18, 2024)
- ☆11 · Updated Sep 20, 2024
- [CVPRW 2023] "Many-Task Federated Learning: A New Problem Setting and A Simple Baseline" by Ruisi Cai, Xiaohan Chen, Shiwei Liu, Jayanth … (☆13 · Updated Aug 28, 2023)
- ☆14 · Updated Jun 4, 2024
- Implementation of our PR 2020 paper: Unsupervised Text-to-Image Synthesis (☆13 · Updated Jul 9, 2020)
- ☆52 · Updated Nov 5, 2024
- Lightning support for Intel Habana accelerators (☆27 · Updated Aug 1, 2025)
- ☆34 · Updated Aug 27, 2025
- ☆15 · Updated Feb 21, 2024
- [ICML 2024] Official implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks (☆39 · Updated Feb 4, 2025)
- [ICLR 2025] Official implementation of the paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models" (☆24 · Updated Mar 16, 2025)
- Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity" (☆81 · Updated Jul 7, 2025)
- [NeurIPS 2024] Low-rank, memory-efficient optimizer without SVD (☆33 · Updated Jul 1, 2025)
- [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs (☆228 · Updated Jan 11, 2025)
- ☆26 · Updated Nov 23, 2023
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) (☆369 · Updated Apr 22, 2025)
- ☆28 · Updated Feb 21, 2025
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models (☆70 · Updated Jan 6, 2024)
- Neural Fixed-Point Acceleration for Convex Optimization (☆29 · Updated Oct 6, 2022)
- ☆55 · Updated Jul 7, 2025
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation (☆32 · Updated Nov 16, 2024)
- [NeurIPS 2024] Official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exitin… (☆66 · Updated Jun 26, 2024)
- ☆31 · Updated Mar 23, 2024
- Official implementation of the ICLR 2024 paper AffineQuant (☆28 · Updated Mar 30, 2024)
- Code associated with the ACL 2021 DExperts paper (☆118 · Updated May 24, 2023)
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs (☆122 · Updated Jul 4, 2025)
- Unofficial implementations of block/layer-wise pruning methods for LLMs (☆77 · Updated Apr 29, 2024)
- Code for the ICLR 2022 paper "Controlling Directions Orthogonal to a Classifier" (☆35 · Updated Jun 6, 2023)
- ☆82 · Updated Nov 11, 2024
- RL with Experience Replay (☆55 · Updated Jul 27, 2025)
- ☆77 · Updated Apr 29, 2024
- PB-LLM: Partially Binarized Large Language Models (☆156 · Updated Nov 20, 2023)
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" (☆473 · Updated Apr 21, 2024)
- MaskedTensors for PyTorch (☆39 · Updated Jul 17, 2022)
- Recycling diverse models (☆46 · Updated Jan 18, 2023)
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference (☆374 · Updated Jul 10, 2025)