AlpinDale / RPTQ-for-LLaMA
Efficient 3bit/4bit quantization of LLaMA models
☆18May 18, 2023Updated 2 years ago
Alternatives and similar repositories for RPTQ-for-LLaMA
Users that are interested in RPTQ-for-LLaMA are comparing it to the libraries listed below
- Conversion script adapting vicuna dataset into alpaca format for use with oobabooga's trainer☆13Jun 21, 2023Updated 2 years ago
- Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.☆71Mar 30, 2023Updated 2 years ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia☆42Mar 13, 2023Updated 2 years ago
- SimplePIM is the first high-level programming framework for real-world processing-in-memory (PIM) architectures. Described in the PACT 20…☆31Oct 23, 2023Updated 2 years ago
- ☆40Mar 25, 2023Updated 2 years ago
- ☆11Aug 20, 2025Updated 5 months ago
- ☆13May 21, 2023Updated 2 years ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 7 months ago
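- Chinese financial LLM evaluation benchmark: six categories, twenty-five tasks, tiered grading; Chinese domestic models achieve an A rating☆10May 6, 2024Updated last year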
- BFloat16 Fused Adam Operator for PyTorch☆16Nov 16, 2024Updated last year
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)☆11Apr 18, 2025Updated 9 months ago
- UnitEval is a benchmarking and evaluation tool for AutoDev Coder.☆13Jan 2, 2024Updated 2 years ago
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation.☆15Mar 22, 2023Updated 2 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization☆13Mar 20, 2025Updated 10 months ago
- This project implements the Titans architecture from the paper "Titans: Learning to Memorize at Test Time" for market data prediction.☆11Jan 19, 2025Updated last year
- This is the code repo for our paper "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression".☆12Feb 27, 2024Updated last year
- [NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference☆17Jun 19, 2025Updated 7 months ago
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023☆12Dec 13, 2023Updated 2 years ago
- SYSU-ARCH is a lab that focuses on using and extending simulators.☆10Dec 19, 2022Updated 3 years ago
- LCA-on-the-line (ICML 2024 Oral)☆13Feb 13, 2025Updated last year
- ☆20Aug 14, 2025Updated 6 months ago
- ☆14Jul 12, 2021Updated 4 years ago
- ☆11May 18, 2025Updated 8 months ago
- ☆13Jan 22, 2025Updated last year
- An implementation of LazyLLM token pruning for LLaMa 2 model family.☆13Jan 6, 2025Updated last year
- SDXL GPU cluster scripts☆16Oct 28, 2023Updated 2 years ago
- KDSS is the framework for knowledge distillation from LLMs☆12Nov 5, 2025Updated 3 months ago
- Source code of "Reinforcement Learning with Token-level Feedback for Controllable Text Generation" (NAACL 2024)☆17Dec 8, 2024Updated last year
- [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning☆23Oct 14, 2025Updated 4 months ago
- Reflect-RL: Two-Player Online RL Fine-Tuning for LMs☆18Jul 19, 2025Updated 6 months ago
- Easy to use library for face detection and alignment on Android☆12Dec 11, 2020Updated 5 years ago
- ☆15Apr 11, 2024Updated last year
- quick and dirty benchmark for TFLite gles delegate on iOS☆12Aug 10, 2021Updated 4 years ago
- Real time traffic sign classification using deep learning☆13May 8, 2017Updated 8 years ago
- ☆73Dec 16, 2025Updated last month
- Joe's Data Structures Library (JDL)☆13Oct 30, 2024Updated last year
- ☆20Aug 13, 2024Updated last year
- A C++ fork/rewrite of the smhasher project to bring Murmurhash v.3 to the Linux shell and to the PHP scripting language.☆21Jul 25, 2011Updated 14 years ago
- ☆21Jul 3, 2025Updated 7 months ago