TheBloke / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆38 · Updated 2 years ago
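For context on what the listed package does, below is a minimal sketch of AutoGPTQ's quantize-and-save workflow, loosely following its documented quick-start. The model name, calibration sentence, and output directory are illustrative placeholders, not part of this listing.

```python
# Minimal AutoGPTQ sketch: quantize a small model to 4-bit GPTQ, save it, reload it.
# Model name, calibration text, and output path are illustrative only.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "facebook/opt-125m"   # example model
quantized_model_dir = "opt-125m-4bit-gptq"   # example output directory

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)

# GPTQ needs a small set of tokenized calibration examples.
examples = [
    tokenizer("AutoGPTQ is an easy-to-use LLM quantization package based on the GPTQ algorithm.")
]

# 4-bit configuration; group_size=128 and desc_act=False are common settings.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)
model.quantize(examples)                      # run GPTQ calibration and quantize the weights
model.save_quantized(quantized_model_dir, use_safetensors=True)

# Later, load the quantized checkpoint for inference.
model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir, device="cuda:0")
```

Several of the repositories listed below (API servers, text-generation-webui extensions, Docker images) build on this same quantize/load workflow.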
Alternatives and similar repositories for AutoGPTQ
Users interested in AutoGPTQ are comparing it to the libraries listed below.
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ☆109 · Updated 2 years ago
- 4-bit quantization of LLaMA using GPTQ ☆131 · Updated 2 years ago
- Load local LLMs effortlessly in a Jupyter notebook for testing purposes alongside Langchain or other agents. Contains Oobabooga and Kobol… ☆213 · Updated 2 years ago
- Creates a Langchain Agent which uses the WebUI's API and Wikipedia to work ☆74 · Updated 2 years ago
- oobabooga/text-generation-webui implementation of wafflecomposite's langchain-ask-pdf-local ☆71 · Updated 2 years ago
- TheBloke's Dockerfiles ☆308 · Updated last year
- A prompt/context management system ☆170 · Updated 2 years ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆124 · Updated 2 years ago
- Visual Studio Code extension for WizardCoder ☆149 · Updated 2 years ago
- Harnessing the Memory Power of the Camelids ☆147 · Updated 2 years ago
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆161 · Updated 2 years ago
- Lord of LLMs ☆294 · Updated 3 months ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆64 · Updated 2 years ago
- ☆74 · Updated 2 years ago
- Porting BabyAGI to Oobabooga. ☆31 · Updated 2 years ago
- An Extension for oobabooga/text-generation-webui ☆36 · Updated 2 years ago
- Host a GPTQ model using AutoGPTQ as an API that is compatible with the text-generation-webui API. ☆91 · Updated 2 years ago
- Deploy your GGML models to HuggingFace Spaces with Docker and gradio ☆38 · Updated 2 years ago
- ☆276 · Updated 2 years ago
- A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. ☆71 · Updated 2 years ago
- Local LLM ReAct Agent with Guidance ☆159 · Updated 2 years ago
- An OpenAI-like LLaMA inference API ☆113 · Updated 2 years ago
- 💬 Chatbot web app + HTTP and WebSocket endpoints for LLM inference with the Petals client ☆317 · Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction' ☆240 · Updated last year
- A repository to store helpful information and emerging insights in regard to LLMs ☆21 · Updated 2 years ago
- Extension for using alternative GitHub Copilot (StarCoder API) in VSCode ☆100 · Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆130 · Updated 2 years ago
- An Autonomous LLM Agent that runs on Wizcoder-15B