huggingface / safetensors
Simple, safe way to store and distribute tensors
★3,338 · Updated last week
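For orientation before the comparison list, here is a minimal sketch of the save/load round trip the tagline describes, assuming the PyTorch bindings (`safetensors.torch`) and a placeholder filename `model.safetensors`:

```python
# Minimal sketch of a safetensors save/load round trip.
# Assumes PyTorch is installed; "model.safetensors" is a placeholder filename.
import torch
from safetensors.torch import save_file, load_file

tensors = {
    "embedding.weight": torch.zeros(1024, 768),
    "lm_head.weight": torch.zeros(768, 32000),
}
save_file(tensors, "model.safetensors")    # writes a flat, mmap-friendly file

restored = load_file("model.safetensors")  # returns a dict of named tensors
assert torch.equal(restored["embedding.weight"], tensors["embedding.weight"])
```

Unlike pickle-based checkpoints, loading a safetensors file never executes arbitrary code, which is the "safe" part of the tagline.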
Alternatives and similar repositories for safetensors
Users interested in safetensors are comparing it to the libraries listed below.
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ★2,967 · Updated last week
- Accessible large language models via k-bit quantization for PyTorch. ★7,212 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ★9,696 · Updated this week
- A machine learning compiler for GPUs, CPUs, and ML accelerators ★3,327 · Updated this week
- PyTorch native quantization and sparsity for training and inference ★2,168 · Updated this week
- Development repository for the Triton language and compiler ★16,114 · Updated this week
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ★2,026 · Updated last week
- AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ★4,655 · Updated 3 months ago
- ★2,839 · Updated last month
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Bla… ★2,529 · Updated last week
- Transformer-related optimization, including BERT and GPT ★6,231 · Updated last year
- Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs. ★1,996 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★8,914 · Updated this week
- PyTorch extensions for high performance and large scale training. ★3,337 · Updated 2 months ago
- PyTorch native post-training library ★5,306 · Updated this week
- 🤗 Evaluate: A library for easily evaluating machine learning models and datasets. ★2,254 · Updated 3 weeks ago
- Large Language Model Text Generation Inference ★10,311 · Updated this week
- Inference Llama 2 in one file of pure 🔥 ★2,115 · Updated last year
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ★6,011 · Updated 3 months ago
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ★4,890 · Updated 3 months ago
- Python bindings for the Transformer models implemented in C/C++ using the GGML library. ★1,866 · Updated last year
- Fast and memory-efficient exact attention ★18,252 · Updated this week
- Tensor library for machine learning ★12,808 · Updated this week
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ★1,575 · Updated last year
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ★2,167 · Updated 9 months ago
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. ★2,202 · Updated 2 months ago
- Fast inference engine for Transformer models ★3,902 · Updated 3 months ago
- A Python package extending official PyTorch to easily obtain extra performance on Intel platforms ★1,900 · Updated this week
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ★3,127 · Updated last month
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX R… ★2,449 · Updated this week