Efficient 3bit/4bit quantization of LLaMA models
☆18 · May 18, 2023 · Updated 2 years ago
Alternatives and similar repositories for RPTQ-for-LLaMA
Users that are interested in RPTQ-for-LLaMA are comparing it to the libraries listed below.
- Conversion script adapting vicuna dataset into alpaca format for use with oobabooga's trainer · ☆13 · Jun 21, 2023 · Updated 2 years ago
- Fork of kingoflolz/mesh-transformer-jax with memory usage optimizations and support for GPT-Neo, GPT-NeoX, BLOOM, OPT and fairseq dense L… · ☆22 · Nov 14, 2022 · Updated 3 years ago
- Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation. · ☆71 · Mar 30, 2023 · Updated 3 years ago
- Create, edit and convert AI character files for CharacterAI, Pygmalion, Text Generation, KoboldAI and TavernAI · ☆23 · Dec 4, 2023 · Updated 2 years ago
- BigKnow2022: Bringing Language Models Up to Speed · ☆16 · Mar 27, 2023 · Updated 3 years ago
- C/C++ implementation of PygmalionAI/pygmalion-6b · ☆55 · Apr 18, 2023 · Updated 3 years ago
- gui for Merge-Stable-Diffusion-models-without-distortion-gui · ☆36 · Dec 31, 2022 · Updated 3 years ago
- The Pygmalion Docs · ☆19 · Sep 16, 2023 · Updated 2 years ago
- Precompiled Wheels for GPTQ-for-LLaMa · ☆19 · Jul 26, 2023 · Updated 2 years ago
- SimplePIM is the first high-level programming framework for real-world processing-in-memory (PIM) architectures. Described in the PACT 20… · ☆32 · Oct 23, 2023 · Updated 2 years ago
- An Android Application for GLCC · ☆11 · Sep 30, 2022 · Updated 3 years ago
- CMake configurations for PPL projects · ☆12 · Aug 10, 2024 · Updated last year
- An unsupervised model merging algorithm for Transformers-based language models. · ☆108 · Apr 29, 2024 · Updated last year
- ☆94 · Mar 28, 2026 · Updated 3 weeks ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA · ☆125 · Jun 16, 2023 · Updated 2 years ago
- BFloat16 Fused Adam Operator for PyTorch · ☆19 · Nov 16, 2024 · Updated last year
- Xenoblade 3 research · ☆14 · Dec 9, 2025 · Updated 4 months ago
- A simple Gradio WebUI for loading/unloading models and loras in tabbyAPI. · ☆20 · Nov 21, 2024 · Updated last year
- ☆40 · Mar 25, 2023 · Updated 3 years ago
- A repurpose of a Counter-Strike: Global Offensive cheat for in-game data collection and dataset creation. · ☆15 · Jan 15, 2022 · Updated 4 years ago
- Self-contained Python lib with zero dependencies that gives you unified device properties for GPU, CPU, and NPU. No more calling separat… · ☆14 · Updated this week
- Unofficial implementation of Semantic-aware Guidance (S-CFG) for ComfyUI · ☆12 · Aug 8, 2024 · Updated last year
- A fork of textgen that kept some things like Exllama and old GPTQ. · ☆22 · Aug 20, 2024 · Updated last year
- Never forget anything again! Combine AI and intelligent tooling for a local knowledge base to track, catalogue, annotate, and plan for you… · ☆37 · May 14, 2024 · Updated last year
- ☆20 · Jun 1, 2023 · Updated 2 years ago
- ☆536 · Dec 1, 2023 · Updated 2 years ago
- View shadertoy shaders on your keyboard, save them and use them as your keyboard background animation! · ☆10 · Dec 14, 2016 · Updated 9 years ago
- SDXL GPU cluster scripts · ☆16 · Oct 28, 2023 · Updated 2 years ago
- ☆33 · Apr 23, 2023 · Updated 2 years ago
- A C++ fork/rewrite of the smhasher project to bring Murmurhash v.3 to the Linux shell and to the PHP scripting language. · ☆21 · Jul 25, 2011 · Updated 14 years ago
- gimme karmas now · ☆11 · Sep 9, 2021 · Updated 4 years ago
- ☆31 · May 23, 2025 · Updated 10 months ago
- My proprietary procedure. Caffe implementation of SSD and SSDLite detection on MobileNetv2, converted from tensorflow. · ☆23 · Mar 20, 2019 · Updated 7 years ago
- What do CLIP Vision Transformers learn? Feature Visualization can show you! · ☆15 · Aug 29, 2024 · Updated last year
- A device-independent random number generator · ☆18 · Apr 27, 2024 · Updated last year
- ☆23 · Aug 7, 2021 · Updated 4 years ago
- GPTQ inference TVM kernel · ☆40 · Apr 25, 2024 · Updated last year
- For SDXL, SD1.5, Flux. Nuke T5 and let CLIP guide Flux.1 on its own! Or let random guide Flux.1! Or load a CLIP crazy opinion embeddi… · ☆25 · Aug 5, 2025 · Updated 8 months ago
- An easy-to-use package for implementing SmoothQuant for LLMs · ☆111 · Apr 7, 2025 · Updated last year