4-bit quantization of LLMs using GPTQ
☆49Jul 26, 2023Updated 2 years ago
Alternatives and similar repositories for GPTQ-for-LLaMa
Users who are interested in GPTQ-for-LLaMa are comparing it to the libraries listed below.
- Oobabooga extension for Bark TTS☆119Nov 23, 2023Updated 2 years ago
- A new repo to demonstrate tutorials for using HuggingFace on Graphcore IPUs.☆12May 3, 2023Updated 3 years ago
- A port of the RWKV v7 language model, implemented with the Burn deep learning framework☆14Jun 9, 2025Updated last year
- An extension to Oobabooga to add a simple memory function for chat☆25Jun 5, 2023Updated 3 years ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…☆11Apr 26, 2023Updated 3 years ago
- A version of BabyAGI with numpy instead of pinecone and an evaluation agent to check success criteria☆15Apr 18, 2023Updated 3 years ago
- Simple OpenGL canvas/event handling library☆14May 7, 2024Updated 2 years ago
- [OBSOLETE] Extensions API for SillyTavern.☆686Dec 10, 2024Updated last year
- ☆12Mar 21, 2024Updated 2 years ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia☆42Mar 13, 2023Updated 3 years ago
- ☆11Mar 10, 2023Updated 3 years ago
- Teensy Audio Library☆14Jul 20, 2020Updated 5 years ago
- Go command line app to exploit file upload vulnerability☆12Feb 8, 2017Updated 9 years ago
- Discord integration for the oobabooga's text-generation-webui☆13Apr 27, 2023Updated 3 years ago
- ☆16Jun 6, 2023Updated 3 years ago
- ☆13Jul 12, 2024Updated last year
- ☆13Aug 6, 2024Updated last year
- Learn Japanese using music. Frontend written in Nuxt.js and optional backend using Litserve☆21Jun 2, 2025Updated last year
- 🎙️ P³: Lightning-fast podcast processing with Apple Silicon optimization and local LLMs. Parakeet MLX transcription + Ollama analysis = …☆30Aug 25, 2025Updated 9 months ago
- JAX implementation of GPTQ quantization algorithm☆10Jul 19, 2023Updated 2 years ago
- Simple Android SDK for Publitio☆10Jan 16, 2021Updated 5 years ago
- ☆40Mar 25, 2023Updated 3 years ago
- Inverse Kinematics demystify☆13Jun 16, 2020Updated 5 years ago
- A library for calibrating classifiers and computing calibration metrics☆14Nov 28, 2022Updated 3 years ago
- Peach - the porn organizer☆12Jun 10, 2024Updated last year
- [DEPRECATED] Attempts to convert a Flux lora to a Chroma lora☆21Nov 9, 2025Updated 7 months ago
- Codebase for the arxiver dataset☆14Nov 29, 2024Updated last year
- Fluentd output plugin that sends events to Amazon Kinesis Streams and Amazon Kinesis Firehose.☆13Apr 2, 2023Updated 3 years ago
- A simple extension that uses Bark Text-to-Speech for audio output☆11Nov 20, 2023Updated 2 years ago
- [Exclusive for GitHub] deep-muse: Advanced Text-to-Music Generator Implementation☆16Mar 17, 2022Updated 4 years ago
- arxiv.org api for scientific papers☆11Oct 12, 2015Updated 10 years ago
- Node-RED Flow (and web page example) for the LLaMA AI model☆11Jul 27, 2023Updated 2 years ago
- Story understanding and plot analysis pilot.☆10Dec 27, 2022Updated 3 years ago
- Re-implementation of Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling (T. Shen et al., ICLR 2018) on P…☆42Feb 22, 2018Updated 8 years ago
- Create embeddings for LLM using the Nomic API☆23Nov 21, 2024Updated last year
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …☆12Mar 18, 2023Updated 3 years ago
- Neural coreference resolution☆12Sep 3, 2024Updated last year
- Explorations into adversarial losses on top of autoregressive loss for language modeling☆41Dec 21, 2025Updated 5 months ago
- A Next.js chatbot app demonstrating seamless integration with window.ai.☆15Jun 25, 2023Updated 2 years ago