bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

☆7,400

Alternatives and similar repositories for bitsandbytes

Users who are interested in bitsandbytes are comparing it to the libraries listed below.

AutoGPTQ / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆4,905 Updated 3 months ago
facebookresearch / xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
☆9,788 Updated this week
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,583 Updated last year
EleutherAI / lm-evaluation-harness
A framework for few-shot evaluation of language models.
☆9,706 Updated this week
mit-han-lab / llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
☆3,181 Updated 2 weeks ago
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆18,656 Updated this week
huggingface / accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i…
☆8,971 Updated last week
NVIDIA / FasterTransformer
Transformer related optimization, including BERT, GPT
☆6,261 Updated last year
IST-DASLab / gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
☆2,151 Updated last year
huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆19,184 Updated this week
huggingface / optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization…
☆2,998 Updated this week
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,122 Updated this week
microsoft / LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
☆12,443 Updated 7 months ago
pytorch / torchtune
PyTorch native post-training library
☆5,366 Updated this week
huggingface / trl
Train transformer language models with reinforcement learning.
☆14,736 Updated this week
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,367 Updated last week
casper-hansen / AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
☆2,221 Updated 2 months ago
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,081 Updated last month
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,688 Updated last year
qwopqwop200 / GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
☆3,061 Updated last year
EleutherAI / pythia
The hub for EleutherAI's work on interpretability and learning dynamics
☆2,575 Updated last month
yizhongw / self-instruct
Aligning pretrained language models with instruction data generated by themselves.
☆4,437 Updated 2 years ago
microsoft / LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
☆4,081 Updated last month
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆13,010 Updated this week
ModelTC / LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili…
☆3,412 Updated this week
OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,889 Updated last year
deepspeedai / DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
☆2,042 Updated last month
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆10,038 Updated last week
huggingface / safetensors
Simple, safe way to store and distribute tensors
☆3,369 Updated 3 weeks ago
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆6,949 Updated last year