WapaMario63 / GPTQ-for-LLaMa-ROCm
4-bit quantization of LLaMA using GPTQ, ported to HIP for use on AMD GPUs.
☆32 · Updated 2 years ago
Alternatives and similar repositories for GPTQ-for-LLaMa-ROCm
Users interested in GPTQ-for-LLaMa-ROCm are comparing it to the libraries listed below.
- DEPRECATED!☆50 · Updated last year
- AMD (Radeon GPU) ROCm-based setup for popular AI tools on Ubuntu 24.04.1☆217 · Updated last week
- Generate Large Language Model text in a container.☆20 · Updated 2 years ago
- Falcon LLM ggml framework with CPU and GPU support☆249 · Updated 2 years ago
- ☆155 · Updated 2 years ago
- A manual for using the Tesla P40 GPU☆142 · Updated last year
- A prompt/context management system☆168 · Updated 2 years ago
- Web UI for ExLlamaV2☆513 · Updated last year
- TheBloke's Dockerfiles☆308 · Updated last year
- An autonomous AI agent extension for Oobabooga's web UI☆173 · Updated 2 years ago
- 8-bit CUDA functions for PyTorch, ported to HIP for use on AMD GPUs☆53 · Updated 2 years ago
- A community list of common phrases generated by GPT and Claude models☆79 · Updated 2 years ago
- Wheels for llama-cpp-python compiled with cuBLAS support☆102 · Updated 2 years ago
- ☆535 · Updated 2 years ago
- A Gradio web UI for running large language models like GPT-J 6B, OPT, GALACTICA, LLaMA, and Pygmalion.☆310 · Updated 2 years ago
- 4-bit quantization of LLaMA using GPTQ☆131 · Updated 2 years ago
- A more memory-efficient rewrite of the HF Transformers implementation of Llama for use with quantized weights.☆64 · Updated 2 years ago
- A fork of textgen that kept some things like ExLlama and old GPTQ.☆22 · Updated last year
- BabyAGI adapted to run with locally hosted models using the API from https://github.com/oobabooga/text-generation-webui☆87 · Updated 2 years ago
- An LLM that combines the principles of WizardLM and VicunaLM☆716 · Updated 2 years ago
- 8-bit CUDA functions for PyTorch, ROCm-compatible☆41 · Updated last year
- ☆36 · Updated 2 years ago
- ☆54 · Updated 2 years ago
- Prometheus exporter for Linux-based GDDR6/GDDR6X VRAM and GPU core hot-spot temperature readings for NVIDIA RTX 3000/4000 series GPUs.☆24 · Updated last year
- Make PyTorch models at least run on APUs.☆56 · Updated 2 years ago
- Automated prompting and scoring framework to evaluate LLMs using updated human-knowledge prompts☆109 · Updated 2 years ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers (QLoRA)☆125 · Updated 2 years ago
- Docker configuration for koboldcpp☆41 · Updated last year
- ☆50 · Updated 2 years ago
- Docker variants of oobabooga's text-generation-webui, including pre-built images.☆447 · Updated 3 months ago