Efficient 3bit/4bit quantization of LLaMA models
☆18May 18, 2023Updated 3 years ago
Alternatives and similar repositories for RPTQ-for-LLaMA
Users that are interested in RPTQ-for-LLaMA are comparing it to the libraries listed below.
- Conversion script adapting vicuna dataset into alpaca format for use with oobabooga's trainer☆13Jun 21, 2023Updated 2 years ago
- LLM RP TUI for Power Users.☆35Jan 13, 2026Updated 5 months ago
- Fork of kingoflolz/mesh-transformer-jax with memory usage optimizations and support for GPT-Neo, GPT-NeoX, BLOOM, OPT and fairseq dense L…☆22Nov 14, 2022Updated 3 years ago
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation.☆15Mar 22, 2023Updated 3 years ago
- BigKnow2022: Bringing Language Models Up to Speed☆16Mar 27, 2023Updated 3 years ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia☆42Mar 13, 2023Updated 3 years ago
- The official service back-end.☆13Apr 8, 2023Updated 3 years ago
- gui for Merge-Stable-Diffusion-models-without-distortion-gui☆36Dec 31, 2022Updated 3 years ago
- The Pygmalion Docs☆19Sep 16, 2023Updated 2 years ago
- Precompiled Wheels for GPTQ-for-LLaMa☆19Jul 26, 2023Updated 2 years ago
- A powerful GUI app and Toolkit for Claude Code - Create custom agents, manage interactive Claude Code sessions, run secure background age…☆13Jul 18, 2025Updated 11 months ago
- SimplePIM is the first high-level programming framework for real-world processing-in-memory (PIM) architectures. Described in the PACT 20…☆35Oct 23, 2023Updated 2 years ago
- An Android Application for GLCC☆11Sep 30, 2022Updated 3 years ago
- CMake configurations for PPL projects☆12Aug 10, 2024Updated last year
- An unsupervised model merging algorithm for Transformers-based language models.☆108Apr 29, 2024Updated 2 years ago
- ☆97Mar 28, 2026Updated 2 months ago
- Official website address for M78星云机场 (M78 Nebula Airport)☆13Nov 20, 2025Updated 6 months ago
- BFloat16 Fused Adam Operator for PyTorch☆19Nov 16, 2024Updated last year
- A simple Gradio WebUI for loading/unloading models and loras in tabbyAPI.☆20Nov 21, 2024Updated last year
- Writings and Games by David Schirduan☆16Mar 31, 2026Updated 2 months ago
- ☆40Mar 25, 2023Updated 3 years ago
- Share your GPU without MIG or MPS☆51Jan 27, 2026Updated 4 months ago
- Small repository for my video on LoRA☆16May 14, 2023Updated 3 years ago
- Self-contained Python lib with zero-dependencies that give you a unified device properties for gpu, cpu, and npu. No more calling separat…☆15Apr 23, 2026Updated last month
- Unofficial implementation of Semantic-aware Guidance (S-CFG) for ComfyUI☆13Aug 8, 2024Updated last year
- Simple local all-in-one install for IDEA2.ART☆26Jan 8, 2023Updated 3 years ago
- Turns KoboldAI into a crowdsourced distributed cluster☆34Oct 19, 2023Updated 2 years ago
- SDXL GPU cluster scripts☆16Oct 28, 2023Updated 2 years ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you…☆38May 14, 2024Updated 2 years ago
- A C++ fork/rewrite of the smhasher project to bring Murmurhash v.3 to the Linux shell and to the PHP scripting language.☆21Jul 25, 2011Updated 14 years ago
- gimme karmas now☆11Sep 9, 2021Updated 4 years ago
- LLM Powered discord bot, Character Card enabled Chat page, Stable Diffusion discord bot, and overall AI tool. All from one app, TalOS: Re…☆34Oct 20, 2024Updated last year
- Creating Interactive and Embedded Physics Simulations from Static Textbook Diagrams☆31Mar 18, 2025Updated last year
- Compare Savant and PyTorch performance☆13Feb 9, 2024Updated 2 years ago
- My proprietary procedure. Caffe implementation of SSD and SSDLite detection on MobileNetv2, converted from tensorflow.☆23Mar 20, 2019Updated 7 years ago
- What do CLIP Vision Transformers learn? Feature Visualization can show you!☆15Aug 29, 2024Updated last year
- A device-independent random number generator☆18Apr 27, 2024Updated 2 years ago
- ☆23Aug 7, 2021Updated 4 years ago
- GPTQ inference TVM kernel☆40Apr 25, 2024Updated 2 years ago