huggingface / safetensors
Simple, safe way to store and distribute tensors
⭐ 3,279 · Updated last week
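The safetensors file layout is deliberately simple: an 8-byte little-endian unsigned integer giving the size of a JSON header, the JSON header itself (mapping each tensor name to its dtype, shape, and byte offsets), then one flat byte buffer that the offsets index into. A minimal pure-Python sketch of that layout follows; the function names and the `(dtype, shape, raw_bytes)` tensor representation are illustrative, not the library's actual API:

```python
import json
import struct

def serialize_safetensors(tensors):
    """Build a safetensors-format byte string.

    `tensors` maps a name to (dtype_str, shape_list, raw_bytes), e.g.
    ("F32", [2, 2], 16 bytes of little-endian float32 data).
    Layout: u64-LE header size | JSON header | flat byte buffer.
    """
    header = {}
    buffer = bytearray()
    for name, (dtype, shape, raw) in tensors.items():
        begin = len(buffer)
        buffer.extend(raw)
        # data_offsets are relative to the start of the byte buffer
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [begin, len(buffer)]}
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(buffer)

def parse_safetensors(data):
    """Inverse of serialize_safetensors: recover {name: (dtype, shape, raw)}."""
    (header_size,) = struct.unpack_from("<Q", data, 0)
    header = json.loads(data[8:8 + header_size])
    buffer = data[8 + header_size:]
    out = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        begin, end = info["data_offsets"]
        out[name] = (info["dtype"], info["shape"], buffer[begin:end])
    return out
```

Because the header is plain JSON and the offsets are declared up front, a reader can inspect or lazily load individual tensors without executing arbitrary code, which is the safety property that distinguishes this format from pickle-based checkpoints.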
Alternatives and similar repositories for safetensors
Users interested in safetensors are comparing it to the libraries listed below.
- **bitsandbytes**: Accessible large language models via k-bit quantization for PyTorch. ⭐ 7,088 · Updated this week
- **optimum**: 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ⭐ 2,916 · Updated this week
- **xformers**: Hackable and optimized Transformers building blocks, supporting a composable construction. ⭐ 9,527 · Updated this week
- **accelerate**: 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ⭐ 8,771 · Updated this week
- **flash-attention**: Fast and memory-efficient exact attention ⭐ 17,572 · Updated last week
- **AITemplate**: AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ⭐ 4,640 · Updated 2 months ago
- **DeepSpeed-MII**: MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ⭐ 2,009 · Updated 2 months ago
- **TransformerEngine**: A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Bla… ⭐ 2,435 · Updated last week
- **torchtune**: PyTorch native post-training library ⭐ 5,217 · Updated this week
- **gptq**: Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers". ⭐ 2,119 · Updated last year
- **xla**: A machine learning compiler for GPUs, CPUs, and ML accelerators ⭐ 3,197 · Updated this week
- **FasterTransformer**: Transformer related optimization, including BERT, GPT ⭐ 6,173 · Updated last year
- **fairscale**: PyTorch extensions for high performance and large scale training. ⭐ 3,322 · Updated last month
- **llm-awq**: [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ⭐ 3,029 · Updated 3 weeks ago
- **torchao**: PyTorch native quantization and sparsity for training and inference ⭐ 2,064 · Updated this week
- **ctransformers**: Python bindings for the Transformer models implemented in C/C++ using the GGML library. ⭐ 1,864 · Updated last year
- **kernl**: Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ⭐ 1,567 · Updated last year
- **mergekit**: Tools for merging pretrained large language models. ⭐ 5,754 · Updated last week
- **alpa**: Training and serving large-scale neural networks with auto parallelization. ⭐ 3,134 · Updated last year
- **GPTQ-for-LLaMa**: 4-bit quantization of LLaMA using GPTQ ⭐ 3,050 · Updated 10 months ago
- **lightning-thunder**: Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and … ⭐ 1,350 · Updated this week
- **AutoGPTQ**: An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ⭐ 4,856 · Updated last month
- **torchscale**: Foundation Architecture for (M)LLMs ⭐ 3,076 · Updated last year
- **ggml**: Tensor library for machine learning ⭐ 12,591 · Updated this week
- **qlora**: QLoRA: Efficient Finetuning of Quantized LLMs ⭐ 10,446 · Updated 11 months ago
- **LoRA**: Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ⭐ 12,008 · Updated 5 months ago
- **triton**: Development repository for the Triton language and compiler ⭐ 15,687 · Updated this week
- **deepsparse**: Sparsity-aware deep learning inference runtime for CPUs ⭐ 3,147 · Updated this week
- **gpt-fast**: Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python. ⭐ 5,960 · Updated last month
- **cramming**: Cramming the training of a (BERT-type) language model into limited compute. ⭐ 1,332 · Updated 11 months ago
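A recurring theme in this list is weight quantization (k-bit quantization in bitsandbytes, GPTQ, AWQ, torchao). As background for what those libraries optimize, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is illustrative only and not taken from any of these projects, which use considerably more sophisticated schemes (per-channel scales, calibration data, error compensation):

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization of a nonempty list of floats.

    Picks a scale so the largest magnitude maps to 127, rounds to the
    nearest integer, and clamps to [-127, 127] (the -128 slot is unused
    to keep the range symmetric).
    """
    amax = max(abs(v) for v in values) or 1.0  # avoid dividing by zero
    scale = amax / 127.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Recover approximate float values from int8 codes and their scale."""
    return [q * scale for q in quantized]
```

The point of the real libraries is to shrink this quantization error and the memory footprint simultaneously, e.g. by choosing scales per output channel or by adjusting the remaining weights to compensate for rounding, as GPTQ and AWQ do.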