huggingface / safetensors
Simple, safe way to store and distribute tensors
☆3,434 · Updated this week
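For context, a minimal sketch of the typical safetensors usage from PyTorch, via the documented save_file/load_file helpers; the tensor names and the file path "model.safetensors" are made up for the example:

```python
# Minimal sketch: saving and loading tensors with safetensors' PyTorch API.
# Tensor names and the file path are illustrative, not from the source page.
import torch
from safetensors.torch import save_file, load_file

tensors = {
    "embedding.weight": torch.zeros((1024, 768)),
    "lm_head.weight": torch.zeros((768, 1024)),
}
save_file(tensors, "model.safetensors")

loaded = load_file("model.safetensors")  # dict of name -> torch.Tensor
print(loaded["embedding.weight"].shape)
```

Unlike pickle-based checkpoints, loading a safetensors file does not execute arbitrary code, which is the main motivation behind the format.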
Alternatives and similar repositories for safetensors
Users interested in safetensors are comparing it to the libraries listed below.
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ☆3,066 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ☆7,567 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆9,912 · Updated 2 weeks ago
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,472 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… (see the sketch after this list) ☆9,116 · Updated this week
- PyTorch extensions for high performance and large scale training. ☆3,369 · Updated 4 months ago
- PyTorch native quantization and sparsity for training and inference ☆2,309 · Updated last week
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ☆2,052 · Updated 2 months ago
- AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ☆4,671 · Updated 2 weeks ago
- Transformer-related optimization, including BERT, GPT ☆6,295 · Updated last year
- Fast inference engine for Transformer models ☆4,012 · Updated 5 months ago
- Tensor library for machine learning ☆13,134 · Updated this week
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Bla… ☆2,707 · Updated this week
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ☆1,583 · Updated last year
- PyTorch native post-training library ☆5,458 · Updated this week
- Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python. ☆6,085 · Updated 2 weeks ago
- Python bindings for the Transformer models implemented in C/C++ using the GGML library. ☆1,876 · Updated last year
- Fast and memory-efficient exact attention ☆19,385 · Updated this week
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆4,938 · Updated 5 months ago
- Inference Llama 2 in one file of pure 🔥 ☆2,117 · Updated last year
- PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily wri… ☆1,405 · Updated this week
- ☆2,880 · Updated last week
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,648 · Updated last year
- Large Language Model Text Generation Inference ☆10,477 · Updated last week
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. ☆2,240 · Updated 4 months ago
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,609 · Updated 3 months ago
- Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers". ☆2,179 · Updated last year
- Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs. ☆2,101 · Updated this week
- Minimalistic large language model 3D-parallelism training ☆2,180 · Updated last week
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,898 · Updated last year
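As referenced above, the "simple way to launch, train, and use PyTorch models" entry is Hugging Face Accelerate. A minimal sketch of its prepare/backward pattern, with a toy model, optimizer, and dataloader invented for the example:

```python
# Minimal sketch of the Accelerate training pattern; the model, data, and
# hyperparameters here are placeholders, not from the source page.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # detects device / distributed config automatically

model = torch.nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(64, 16), torch.randn(64, 1))
dataloader = DataLoader(dataset, batch_size=8)

# prepare() wraps each object for the current device and distributed setup
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    accelerator.backward(loss)  # used in place of loss.backward()
    optimizer.step()
```

The same script then runs unmodified on CPU, a single GPU, or multiple processes launched with `accelerate launch`, which is the portability the entry's description refers to.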