[NeurIPS 2023] Token-Scaled Logit Distillation for Ternary Weight Generative Language Models
☆18 Dec 6, 2023 Updated 2 years ago
Alternatives and similar repositories for TSLD
Users who are interested in TSLD are comparing it to the libraries listed below.
- LLM Inference with Microscaling Format ☆34 Nov 12, 2024 Updated last year
- AFPQ code implementation ☆23 Nov 6, 2023 Updated 2 years ago
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. ☆133 May 16, 2024 Updated last year
- super-resolution; post-training quantization; model compression ☆14 Nov 10, 2023 Updated 2 years ago
- BESA is a differentiable weight pruning technique for large language models. ☆17 Mar 4, 2024 Updated 2 years ago
- FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration ☆20 Jun 27, 2025 Updated 8 months ago
- PB-LLM: Partially Binarized Large Language Models ☆156 Nov 20, 2023 Updated 2 years ago
- ☆29 Nov 29, 2023 Updated 2 years ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination ☆13 Apr 29, 2025 Updated 10 months ago
- ☆35 Dec 22, 2025 Updated 2 months ago
- [NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer ☆30 Dec 6, 2023 Updated 2 years ago
- PyTorch implementation of our paper accepted by TPAMI 2023: Lottery Jackpots Exist in Pre-trained Models ☆35 Jun 19, 2023 Updated 2 years ago
- This repository contains the Parasol processor, which enables next-generation privacy preserving applications. Users can run arbitrary co… ☆11 Feb 25, 2026 Updated last week
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ☆172 Nov 26, 2025 Updated 3 months ago
- ☆11 May 24, 2024 Updated last year
- ☆10 Apr 24, 2024 Updated last year
- ☆12 Dec 19, 2023 Updated 2 years ago
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆26 Jun 16, 2025 Updated 8 months ago
- Official repository for ICLR 2025 paper "Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs" ☆16 Mar 18, 2025 Updated 11 months ago
- ☆52 Nov 5, 2024 Updated last year
- Multi-party computation utilities toolkit for Rust ☆16 Dec 19, 2019 Updated 6 years ago
- Unit testing React components with Tape in ES6 ☆12 Feb 25, 2016 Updated 10 years ago
- ☆11 Apr 24, 2023 Updated 2 years ago
- Code and demo for SFHTML5 Talk - D3 in Practice ☆10 Oct 1, 2015 Updated 10 years ago
- [ICML 2025] Official PyTorch implementation of "NegMerge: Sign-Consensual Weight Merging for Machine Unlearning" ☆14 Nov 25, 2025 Updated 3 months ago
- Rust UI components for GPUI ☆14 Nov 9, 2025 Updated 3 months ago
- A noun representation in Rust ☆11 Mar 10, 2023 Updated 2 years ago
- A from-scratch multi-difficulty-level tutorial on how PyTorch, TensorFlow, JAX, etc. work ☆13 Feb 19, 2025 Updated last year
- Official Implementation of Robustifying and Boosting Training-Free Neural Architecture Search ☆10 Mar 12, 2024 Updated last year
- Graph.js is an MVC-like framework for building web applications using the graph data model ☆13 Oct 28, 2015 Updated 10 years ago
- ☆11 Apr 5, 2023 Updated 2 years ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆10 Dec 30, 2024 Updated last year
- ☆10 Oct 17, 2017 Updated 8 years ago
- Wrapper around gremlin-node to provide out of the box support for Titan graph database ☆23 Aug 20, 2014 Updated 11 years ago
- A sample Elixir application hosted on LING VM ☆11 Apr 13, 2013 Updated 12 years ago
- ☆12 Oct 4, 2024 Updated last year
- The original PyTorch implementation of "EXACT: How to Train Your Accuracy" ☆10 Sep 22, 2022 Updated 3 years ago
- RPC/XDR protocol compiler (from jungerl) ☆14 Oct 4, 2019 Updated 6 years ago
- A simple REPL for Lean 4, returning information about errors and sorries. ☆12 Jun 19, 2023 Updated 2 years ago