facebookresearch / bitsandbytesLinks

Library for 8-bit optimizers and quantization routines.

☆780

Alternatives and similar repositories for bitsandbytes

Users that are interested in bitsandbytes are comparing it to the libraries listed below

Sorting:

huggingface / pytorch_block_sparse
Fast Block Sparse Matrices for Pytorch
☆547Updated 4 years ago
huggingface / nn_pruning
Prune a model while finetuning or training.
☆405Updated 3 years ago
ELS-RD / kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab…
☆1,586Updated last year
BlackHC / toma
Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory
☆438Updated last year
NVIDIA / PyProf
A GPU performance profiling tool for PyTorch models
☆508Updated 4 years ago
kaiyuyue / torchshard
Slicing a PyTorch Tensor Into Parallel Shards
☆301Updated 4 months ago
pytorch / ort
Accelerate PyTorch models with ONNX Runtime
☆365Updated 8 months ago
microsoft / fastformers
FastFormers - highly efficient transformer models for NLU
☆706Updated 7 months ago
pytorch / torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
☆1,064Updated last year
Lightning-Universe / lightning-transformers
Flexible components pairing 🤗 Transformers with Pytorch Lightning
☆612Updated 2 years ago
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆276Updated 3 years ago
archinetai / surgeon-pytorch
A library to inspect and extract intermediate layers of PyTorch models.
☆475Updated 3 years ago
cybertronai / pytorch-lamb
Implementation of https://arxiv.org/abs/1904.00962
☆377Updated 4 years ago
pytorch / nestedtensor
[Prototype] Tools for the concurrent manipulation of variably sized Tensors.
☆251Updated 2 years ago
tunib-ai / parallelformers
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
☆790Updated 2 years ago
meta-pytorch / data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
☆1,228Updated last week
Stonesjtu / pytorch_memlab
Profiling and inspecting memory in pytorch
☆1,073Updated last month
google-research / long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
☆767Updated last year
facebookresearch / recipes
Recipes are a standard, well supported set of blueprints for machine learning engineers to rapidly train models using the latest research…
☆329Updated last week
microsoft / mup
maximal update parametrization (µP)
☆1,611Updated last year
pytorch / functorch
functorch is JAX-like composable function transforms for PyTorch.
☆1,436Updated 2 months ago
JonasGeiping / cramming
Cramming the training of a (BERT-type) language model into limited compute.
☆1,349Updated last year
facebookresearch / diffq
DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight …
☆236Updated 2 years ago
PhilJd / contiguous_pytorch_params
Accelerate training by storing parameters in one contiguous chunk of memory.
☆293Updated 5 years ago
AminRezaei0x443 / memory-efficient-attention
Memory Efficient Attention (O(sqrt(n)) for Jax and PyTorch
☆182Updated 2 years ago
lucidrains / RETRO-pytorch
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
☆874Updated last year
ELS-RD / transformer-deploy
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
☆1,690Updated last year
ofirpress / attention_with_linear_biases
Code for the ALiBi method for transformer language models (ICLR 2022)
☆544Updated last year
lukemelas / do-you-even-need-attention
Is the attention layer even necessary? (https://arxiv.org/abs/2105.02723)
☆485Updated 4 years ago
mlpen / Nystromformer
☆383Updated 2 years ago