kssteven418 / SqueezeLLM-gradients
☆21Feb 5, 2024Updated 2 years ago
Alternatives and similar repositories for SqueezeLLM-gradients
Users who are interested in SqueezeLLM-gradients are comparing it to the repositories listed below
- [ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs☆18Jun 3, 2025Updated 8 months ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆39Mar 11, 2024Updated last year
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization☆713Aug 13, 2024Updated last year
- [COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation; 知乎:https://zhuanlan.zhihu.c…☆29Mar 5, 2025Updated 11 months ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models☆23Mar 15, 2024Updated last year
- DeiT implementation for Q-ViT☆25Apr 21, 2025Updated 9 months ago
- Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)☆50Jul 6, 2025Updated 7 months ago
- [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization☆404Aug 13, 2024Updated last year
- HFODetector is a Python package that is capable of detecting HFOs with STE / MNI / Hilbert detector. Detection speed is increased by u…☆12Feb 16, 2025Updated last year
- ☆43Jan 30, 2024Updated 2 years ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…☆44Apr 18, 2025Updated 9 months ago
- ☆11May 24, 2024Updated last year
- Remote Audio Data iOS SDK☆11Aug 19, 2020Updated 5 years ago
- Code repository for the ECCV 2022 (Oral) paper "Cartoon Explanations of Image Classifiers"☆10Nov 24, 2025Updated 2 months ago
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference☆56Nov 20, 2024Updated last year
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"☆11Mar 31, 2024Updated last year
- https://icml.cc/virtual/2023/poster/24354☆10Aug 15, 2023Updated 2 years ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models☆41Aug 4, 2023Updated 2 years ago
- A probabilistic CKY parser for PCFGs☆19Mar 12, 2014Updated 11 years ago
- Minimum viable code for the Decodable Information Bottleneck paper. Pytorch Implementation.☆11Oct 20, 2020Updated 5 years ago
- ☆11Apr 5, 2023Updated 2 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models☆11Dec 13, 2023Updated 2 years ago
- ☆10May 6, 2024Updated last year
- Create reliability diagrams to quantify ML calibration.☆10Feb 1, 2022Updated 4 years ago
- OpenAI GPT model to build your personal assistant in IoT devices. Just like Alexa, Google Assistant, Siri, etc. but with your own skills,…☆12Aug 7, 2023Updated 2 years ago
- Companion repository to "Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models"☆14May 31, 2023Updated 2 years ago
- Official implementation of "Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent".☆21May 23, 2025Updated 8 months ago
- ☆11Apr 24, 2025Updated 9 months ago
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …☆11Jun 18, 2024Updated last year
- Drag & drop UI to build your customized LLM flow☆13Updated this week
- Pytorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training"☆12Jun 7, 2023Updated 2 years ago
- EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation☆27Jul 30, 2025Updated 6 months ago
- ☆10Apr 21, 2022Updated 3 years ago
- ☆13Jul 14, 2025Updated 7 months ago
- Sample code to show how to create an in-memory RAG☆10Mar 10, 2024Updated last year
- ☆10Nov 16, 2024Updated last year
- 😎 Awesome papers on token redundancy reduction☆11Mar 12, 2025Updated 11 months ago
- Toolkit in Python for the acquisition, analysis and visualization of motion capture using IMU☆14May 19, 2021Updated 4 years ago
- Utility functions for weights and biases (wandb).☆11Sep 17, 2024Updated last year