huggingface / safetensors
Simple, safe way to store and distribute tensors
★3,403 · Updated last week
Alternatives and similar repositories for safetensors
Users interested in safetensors are comparing it to the libraries listed below.
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization… ★3,023 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★7,490 · Updated last week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ★9,866 · Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,055 · Updated this week
- AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ★4,670 · Updated last week
- Transformer-related optimization, including BERT, GPT ★6,274 · Updated last year
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ★1,581 · Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ★2,046 · Updated last month
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Bla… ★2,645 · Updated this week
- Python bindings for the Transformer models implemented in C/C++ using the GGML library. ★1,874 · Updated last year
- PyTorch native quantization and sparsity for training and inference ★2,251 · Updated this week
- Large Language Model Text Generation Inference ★10,424 · Updated last week
- PyTorch extensions for high performance and large scale training. ★3,361 · Updated 3 months ago
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ★4,922 · Updated 4 months ago
- A machine learning compiler for GPUs, CPUs, and ML accelerators ★3,426 · Updated this week
- Fast and memory-efficient exact attention ★18,997 · Updated this week
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ★3,207 · Updated last month
- ★2,863 · Updated 2 months ago
- Tensor library for machine learning ★13,017 · Updated last week
- 🤗 Evaluate: A library for easily evaluating machine learning models and datasets. ★2,295 · Updated last week
- Development repository for the Triton language and compiler ★16,568 · Updated this week
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ★2,894 · Updated last year
- Training and serving large-scale neural networks with auto parallelization. ★3,148 · Updated last year
- Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers". ★2,163 · Updated last year
- Fast inference engine for Transformer models ★3,965 · Updated 4 months ago
- A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch. ★2,766 · Updated 2 months ago
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati… ★11,365 · Updated this week
- PyTorch native post-training library ★5,418 · Updated this week
- A PyTorch quantization backend for Optimum ★984 · Updated last month
- 4-bit quantization of LLaMA using GPTQ ★3,065 · Updated last year