AniZpZ/smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

☆11

Alternatives and similar repositories for smoothquant

Users who are interested in smoothquant are comparing it to the libraries listed below.

Adlik / smoothquantplus
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆23 · Mar 15, 2024 · Updated 2 years ago
baoy-nlp / DSS-VAE-pytorch
Generating Sentences from Disentangled Syntactic and Semantic Spaces
☆11 · Jun 24, 2019 · Updated 7 years ago
ROCm / torch_migraphx
Torch-MIGraphX integrates AMD's graph inference engine with the PyTorch ecosystem.
☆20 · Jul 13, 2026 · Updated last week
kq-chen / VLMEvalKit
Open-source evaluation toolkit of large vision-language models (LVLMs), support ~100 VLMs, 30+ benchmarks
☆15 · Feb 17, 2025 · Updated last year
jeromerobert / hmat-oss
A hierarchical matrix C/C++ library
☆27 · Jul 10, 2026 · Updated last week
sgl-project / DeepGEMM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
☆32 · Updated this week
mlverse / tfevents
Write events for TensorBoard
☆13 · Apr 27, 2026 · Updated 2 months ago
Wollender / gitlab-docker-k8s
Continuous integration and delivery based on GitLab+Docker+K8S
☆25 · Jan 24, 2024 · Updated 2 years ago
PlusLabNLP / ECONET
Public codebase for ECONET: EMNLP'21
☆12 · Mar 11, 2022 · Updated 4 years ago
qdrant / demo-cloud-faq
Demo of fine-tuning QA models for answering FAQ of cloud providers documentation
☆11 · Jun 20, 2026 · Updated last month
Sreyan88 / Disfluency-Detection-with-Span-Classification
This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…
☆14 · Jun 6, 2023 · Updated 3 years ago
kssteven418 / SqueezeLLM-gradients
☆21 · Feb 5, 2024 · Updated 2 years ago
vedantroy / gpu_kernels
☆27 · Jan 8, 2024 · Updated 2 years ago
sufenlp / MiLoRA
[NAACL 2025] MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning
☆21 · May 31, 2025 · Updated last year
shijiew / QwenSpinQuant
Code repo for the paper "SpinQuant LLM quantization with learned rotations"
☆15 · Mar 20, 2025 · Updated last year
Yangkeloff / Node.js-practice
Node.js practice
☆10 · Mar 3, 2019 · Updated 7 years ago
taasnim / unified-coherence-model
☆15 · Mar 28, 2022 · Updated 4 years ago
dlwns147 / amq
[EMNLP 2025] AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
☆16 · Apr 29, 2026 · Updated 2 months ago
markgw / gaussianlda
Gaussian LDA training implemented in Python
☆12 · Apr 5, 2021 · Updated 5 years ago
vidhishanair / structured-text-representations
PyTorch implementation of "Learning Structured Text Representations"
☆11 · Oct 1, 2019 · Updated 6 years ago
MayDomine / Seq1F1B
Sequence-level 1F1B schedule for LLMs.
☆19 · Jun 4, 2024 · Updated 2 years ago
agentica-project / verl
☆17 · Mar 30, 2026 · Updated 3 months ago
samyfodil / taubyte-llama-satellite
☆18 · Feb 2, 2024 · Updated 2 years ago
vakovalskii / cursor_agent_flow
Cursor logs with gpt-4o using litellm proxy
☆14 · Sep 9, 2025 · Updated 10 months ago
dsuess / mediapipe-pytorch
☆10 · Aug 3, 2022 · Updated 3 years ago
harshmangalam / qwik-spin-delay
Smart spinner component for Qwik, to manage the duration of loading states.
☆13 · Sep 25, 2023 · Updated 2 years ago
Manuel030 / alpaca-opt
Yet another LLM
☆10 · Apr 6, 2023 · Updated 3 years ago
robvanvolt / DALLE-tools
DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets.
☆14 · Mar 9, 2022 · Updated 4 years ago
Karbo123 / pytorch_grouped_gemm
High Performance Grouped GEMM in PyTorch
☆30 · May 10, 2022 · Updated 4 years ago
bethelmelesse / UnifiedCrawl
☆17 · Nov 26, 2024 · Updated last year
steven2358 / mlx
Machine Learning Explorations - A list of machine learning resources
☆33 · May 9, 2023 · Updated 3 years ago
TroyDoesAI / AI_Research
My Gen AI research
☆11 · Jun 3, 2024 · Updated 2 years ago
schnell18 / lm-quant-toolkit
LLM Quantization toolkit
☆20 · Jun 11, 2026 · Updated last month
kuleshov-group / MODULoRA-Experiment
Evaluation Code repository for the paper "ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers". (2023…
☆13 · Dec 5, 2023 · Updated 2 years ago
PlusLabNLP / ESTER
Public repo for ESTER dataset and modeling (EMNLP'21)
☆20 · Feb 2, 2022 · Updated 4 years ago
dylibso / chainsocket
Proof of concept for a generative AI application framework powered by WebAssembly and Extism
☆14 · Aug 10, 2023 · Updated 2 years ago
chenyu-jiang / nsys2json
A Python script to convert the output of NVIDIA Nsight Systems (in SQLite format) to JSON in Google Chrome Trace Event Format.
☆60 · Aug 5, 2025 · Updated 11 months ago
AniZpZ / AutoSmoothQuant
An easy-to-use package for implementing SmoothQuant for LLMs
☆111 · Apr 7, 2025 · Updated last year
YashasSamaga / ConvolutionBuildingBlocks
GEMM and Winograd based convolutions using CUTLASS
☆28 · Jul 15, 2020 · Updated 6 years ago