agrocylo / bitsandbytes-rocm
8-bit CUDA functions for PyTorch, ported to HIP for use on AMD GPUs
☆49 · Updated 2 years ago
Alternatives and similar repositories for bitsandbytes-rocm:
Users interested in bitsandbytes-rocm are comparing it to the repositories listed below.
- 8-bit CUDA functions for PyTorch, ROCm-compatible ☆39 · Updated last year
- 8-bit CUDA functions for PyTorch ☆48 · Updated 2 months ago
- An unsupervised model merging algorithm for Transformers-based language models. ☆105 · Updated 11 months ago
- DEPRECATED! ☆52 · Updated 10 months ago
- ☆37 · Updated last year
- 4-bit quantization of LLMs using GPTQ ☆49 · Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆86 · Updated this week
- A more memory-efficient rewrite of the HF Transformers implementation of Llama for use with quantized weights. ☆64 · Updated last year
- Fast and memory-efficient exact attention ☆171 · Updated this week
- Instruct-tuning LLaMA on consumer hardware ☆66 · Updated 2 years ago
- A simple converter that converts PyTorch bin files to safetensors, intended for LLM conversion. ☆65 · Updated last year
- ☆535 · Updated last year
- Text WebUI extension that adds clever Notebooks to Chat mode ☆139 · Updated last year
- Merge Transformers language models by use of gradient parameters. ☆206 · Updated 8 months ago
- Image Diffusion block-merging technique applied to Transformers-based language models. ☆54 · Updated last year
- A torchless C++ RWKV implementation using 8-bit quantization, written in CUDA/HIP/Vulkan for maximum compatibility and minimum dependencies ☆310 · Updated last year
- Simple monkeypatch to boost AMD Navi 3 GPUs ☆38 · Updated this week
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆123 · Updated last year
- A KoboldAI-like memory extension for oobabooga's text-generation-webui ☆108 · Updated 5 months ago
- An extension for oobabooga's text-generation-webui that adds syntax highlighting to code snippets ☆67 · Updated 10 months ago
- 4-bit quantization of LLaMA using GPTQ, ported to HIP for use on AMD GPUs. ☆32 · Updated last year
- Falcon LLM ggml framework with CPU and GPU support ☆246 · Updated last year
- ChatGPT-like Web UI for RWKVstic ☆100 · Updated 2 years ago
- Efficient 3-bit/4-bit quantization of LLaMA models ☆19 · Updated last year
- 4-bit quantization of SantaCoder using GPTQ ☆51 · Updated last year
- ☆13 · Updated last year
- Low-rank adapter extraction for fine-tuned Transformers models ☆171 · Updated 11 months ago
- Just a simple HowTo for https://github.com/johnsmith0031/alpaca_lora_4bit ☆31 · Updated last year
- ☆156 · Updated last year
- A prompt/context management system ☆170 · Updated last year