Efficient 3bit/4bit quantization of LLaMA models
☆18 · May 18, 2023 · Updated 3 years ago
Alternatives and similar repositories for RPTQ-for-LLaMA
Users that are interested in RPTQ-for-LLaMA are comparing it to the libraries listed below.
- Fork of kingoflolz/mesh-transformer-jax with memory usage optimizations and support for GPT-Neo, GPT-NeoX, BLOOM, OPT and fairseq dense L… ☆22 · Nov 14, 2022 · Updated 3 years ago
- Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation. ☆71 · Mar 30, 2023 · Updated 3 years ago
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation. ☆15 · Mar 22, 2023 · Updated 3 years ago
- Prompt Jinja2 templates for LLMs ☆35 · Jul 9, 2025 · Updated 10 months ago
- BigKnow2022: Bringing Language Models Up to Speed ☆16 · Mar 27, 2023 · Updated 3 years ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆42 · Mar 13, 2023 · Updated 3 years ago
- The official service back-end. ☆13 · Apr 8, 2023 · Updated 3 years ago
- C/C++ implementation of PygmalionAI/pygmalion-6b ☆55 · Apr 18, 2023 · Updated 3 years ago
- Simple Image Viewer with ability to tag images, search by tags, and mark regions for AI training ☆12 · Mar 19, 2024 · Updated 2 years ago
- The Pygmalion Docs ☆19 · Sep 16, 2023 · Updated 2 years ago
- Precompiled Wheels for GPTQ-for-LLaMa ☆19 · Jul 26, 2023 · Updated 2 years ago
- An unsupervised model merging algorithm for Transformers-based language models. ☆108 · Apr 29, 2024 · Updated 2 years ago
- ☆97 · Mar 28, 2026 · Updated 2 months ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆124 · Jun 16, 2023 · Updated 2 years ago
- Persistent Kernel + JIT-Injected Operators (CUDA) ☆47 · Jan 27, 2026 · Updated 4 months ago
- BFloat16 Fused Adam Operator for PyTorch ☆19 · Nov 16, 2024 · Updated last year
- Writings and Games by David Schirduan ☆16 · Mar 31, 2026 · Updated last month
- ☆40 · Mar 25, 2023 · Updated 3 years ago
- Small repository for my video on LoRA ☆16 · May 14, 2023 · Updated 3 years ago
- Self-contained Python lib with zero-dependencies that give you a unified device properties for gpu, cpu, and npu. No more calling separat… ☆15 · Apr 23, 2026 · Updated last month
- A fork of textgen that kept some things like Exllama and old GPTQ. ☆22 · Aug 20, 2024 · Updated last year
- ☆536 · Dec 1, 2023 · Updated 2 years ago
- Turns KoboldAI into a crowdsourced distributed cluster ☆33 · Oct 19, 2023 · Updated 2 years ago
- SDXL GPU cluster scripts ☆16 · Oct 28, 2023 · Updated 2 years ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆38 · May 14, 2024 · Updated 2 years ago
- A C++ fork/rewrite of the smhasher project to bring Murmurhash v.3 to the Linux shell and to the PHP scripting language. ☆21 · Jul 25, 2011 · Updated 14 years ago
- gimme karmas now ☆11 · Sep 9, 2021 · Updated 4 years ago
- LLM Powered discord bot, Character Card enabled Chat page, Stable Diffusion discord bot, and overall AI tool. All from one app, TalOS: Re… ☆34 · Oct 20, 2024 · Updated last year
- ☆32 · May 23, 2025 · Updated last year
- A device-independent random number generator ☆18 · Apr 27, 2024 · Updated 2 years ago
- An easy-to-use package for implementing SmoothQuant for LLMs ☆111 · Apr 7, 2025 · Updated last year
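- Chinese financial LLM evaluation benchmark: six categories, twenty-five tasks, tiered grading; domestic models achieve an A grade ☆10 · May 6, 2024 · Updated 2 years ago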
- ☆19 · Sep 1, 2025 · Updated 8 months ago
- Image Diffusion block merging technique applied to transformers based Language Models. ☆56 · May 8, 2023 · Updated 3 years ago
- Deploy your GGML models to HuggingFace Spaces with Docker and gradio ☆38 · Jun 6, 2023 · Updated 2 years ago
- UnitEval is a benchmarking and evaluation tools for AutoDev Coder. ☆14 · Jan 2, 2024 · Updated 2 years ago
- annoy long term memory experiment for oobabooga/text-generation-webui ☆30 · Jul 17, 2023 · Updated 2 years ago
- Quickly configure *arr apps ☆31 · Dec 9, 2022 · Updated 3 years ago
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025) ☆12 · Apr 18, 2025 · Updated last year