WapaMario63 / GPTQ-for-LLaMa-ROCm
4-bit quantization of LLaMA using GPTQ, ported to HIP for use on AMD GPUs.
☆32Updated last year
Alternatives and similar repositories for GPTQ-for-LLaMa-ROCm
Users who are interested in GPTQ-for-LLaMa-ROCm are comparing it to the repositories listed below
- ☆37Updated 2 years ago
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts☆109Updated last year
- DEPRECATED!☆52Updated 11 months ago
- A fork of textgen that kept some things like Exllama and old GPTQ.☆22Updated 9 months ago
- AMD (Radeon GPU) ROCm based setup for popular AI tools on Ubuntu 24.04.1☆204Updated 3 months ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA☆123Updated last year
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2.☆153Updated last year
- Falcon LLM ggml framework with CPU and GPU support☆245Updated last year
- 8-bit CUDA functions for PyTorch, ported to HIP for use in AMD GPUs☆49Updated 2 years ago
- 8-bit CUDA functions for PyTorch, ROCm-compatible☆41Updated last year
- Text WebUI extension to add clever Notebooks to Chat mode☆139Updated last year
- 4-bit quantization of LLaMa using GPTQ☆129Updated 2 years ago
- ☆75Updated this week
- ☆158Updated last year
- An extension for oobabooga's text-generation-webui that adds syntax highlighting to code snippets☆67Updated last year
- A prompt/context management system☆170Updated 2 years ago
- CHAracter State Management - a generative text adventure☆43Updated this week
- A KoboldAI-like memory extension for oobabooga's text-generation-webui☆109Updated 7 months ago
- A manual to help with using the Tesla P40 GPU☆126Updated 6 months ago
- ☆53Updated last year
- GPU Power and Performance Manager☆59Updated 7 months ago
- Provide a way to use the GPT-QLLama model as an API☆43Updated 2 years ago
- Creates a LangChain agent that uses the WebUI's API and Wikipedia to work☆74Updated last year
- Web UI for ExLlamaV2☆495Updated 4 months ago
- CHAracter State Management - a generative text adventure (engine)☆65Updated 7 months ago
- A simple Gradio WebUI for loading/unloading models and loras in tabbyAPI.☆20Updated 6 months ago
- A fast batching API to serve LLM models☆181Updated last year
- An unsupervised model merging algorithm for Transformers-based language models.☆104Updated last year
- A fork of vLLM enabling Pascal architecture GPUs☆28Updated 3 months ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.☆63Updated last year