GPU operators for sparse tensor operations
☆35 · Mar 11, 2024 · Updated last year
Alternatives and similar repositories for sparse_gpu_operator
Users who are interested in sparse_gpu_operator are comparing it to the libraries listed below.
- The codes for training sparsity predictor on LLaMA. ☆18 · May 12, 2024 · Updated last year
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆42 · Jan 15, 2024 · Updated 2 years ago
- ☆352 · Apr 2, 2024 · Updated last year
- ☆22 · Dec 15, 2023 · Updated 2 years ago
- ☆27 · Aug 25, 2023 · Updated 2 years ago
- Pytorch implementation for Hypformer: Exploring Efficient Hyperbolic Transformer Fully in Hyperbolic Space (KDD 2024) ☆36 · Aug 17, 2025 · Updated 6 months ago
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference ☆120 · Mar 6, 2024 · Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 · Aug 14, 2024 · Updated last year
- ☆32 · Nov 11, 2024 · Updated last year
- Code release for AdapMoE accepted by ICCAD 2024 ☆35 · Apr 28, 2025 · Updated 10 months ago
- ☆34 · Feb 6, 2026 · Updated 3 weeks ago
- Official code for ICLR 2024 paper "Do Generated Data Always Help Contrastive Learning?" ☆31 · Apr 4, 2024 · Updated last year
- ☆31 · Jun 15, 2022 · Updated 3 years ago
- A tool designed to compare energy and emission costs between computer chips ☆13 · Dec 9, 2023 · Updated 2 years ago
- A python algorithm to change the pitch of the voice in real time ☆13 · Dec 13, 2020 · Updated 5 years ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- For releasing code related to compression methods for transformers, accompanying our publications ☆454 · Jan 16, 2025 · Updated last year
- [NeurIPS'24 LanGame workshop] On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability ☆42 · Jul 7, 2025 · Updated 7 months ago
- ☆159 · Feb 15, 2025 · Updated last year
- [ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache ☆358 · Nov 20, 2025 · Updated 3 months ago
- Triton-based implementation of Sparse Mixture of Experts. ☆266 · Oct 3, 2025 · Updated 4 months ago
- Bamboo-7B Large Language Model ☆93 · Mar 28, 2024 · Updated last year
- [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Se… ☆816 · Mar 6, 2025 · Updated 11 months ago
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference ☆372 · Jul 10, 2025 · Updated 7 months ago
- (READ ONLY MIRROR) The ProB Model Checker and Animator Plugin for Rodin ☆19 · Updated this week
- [ACL 2025] Official code for ''Learning to Reason from Feedback at Test-Time''. ☆13 · May 16, 2025 · Updated 9 months ago
- ☆12 · Jun 17, 2025 · Updated 8 months ago
- Python implementation of the Huffman Code compression algorithm. ☆14 · Apr 18, 2013 · Updated 12 years ago
- ☆29 · Nov 19, 2025 · Updated 3 months ago
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs ☆11 · Jan 13, 2023 · Updated 3 years ago
- ☆16 · Jul 23, 2023 · Updated 2 years ago
- Distributed Communication-Optimal Shuffle and Transpose Algorithm ☆14 · Feb 20, 2026 · Updated last week
- [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization ☆402 · Aug 13, 2024 · Updated last year
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆42 · Mar 13, 2023 · Updated 2 years ago
- An easy-to-use package for implementing SmoothQuant for LLMs ☆110 · Apr 7, 2025 · Updated 10 months ago
- DEPRECATED. This Scalapck repository is deprecated. The last version in this repository is 3.0. Refer to "aocl-scalapack" repository unde … ☆10 · Mar 15, 2021 · Updated 4 years ago
- MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models. ☆16 · Apr 8, 2025 · Updated 10 months ago
- Configuration of the GFortran compiler to use it with Abaqus ☆14 · Aug 7, 2022 · Updated 3 years ago
- Extension for stable diffusion webui to add advance prompt tuning ☆10 · Nov 13, 2022 · Updated 3 years ago