huggingface / safetensors
Simple, safe way to store and distribute tensors
☆2,871 · Updated this week
Related projects
Alternatives and complementary repositories for safetensors
- Accessible large language models via k-bit quantization for PyTorch. ☆6,244 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆8,612 · Updated this week
- 🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools ☆2,556 · Updated this week
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆2,690 · Updated this week
- AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ☆4,553 · Updated 2 weeks ago
- Fast and memory-efficient exact attention ☆14,109 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆7,911 · Updated this week
- ☆2,676 · Updated last week
- Train transformer language models with reinforcement learning. ☆9,967 · Updated this week
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ☆1,891 · Updated this week
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,811 · Updated 9 months ago
- Transformer related optimization, including BERT, GPT ☆5,871 · Updated 7 months ago
- Large Language Model Text Generation Inference ☆9,011 · Updated this week
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ☆1,530 · Updated 8 months ago
- Fast inference engine for Transformer models ☆3,383 · Updated this week
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆3,629 · Updated last week
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆2,498 · Updated 3 weeks ago
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs… ☆1,936 · Updated this week
- Development repository for the Triton language and compiler ☆13,311 · Updated this week
- Foundation Architecture for (M)LLMs ☆3,025 · Updated 6 months ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆2,401 · Updated 2 months ago
- SGLang is a fast serving framework for large language models and vision language models. ☆5,919 · Updated this week
- PyTorch native finetuning library ☆4,267 · Updated this week
- PyTorch extensions for high performance and large scale training. ☆3,187 · Updated 2 months ago
- An open-source framework for training large multimodal models. ☆3,732 · Updated 2 months ago
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain… ☆8,604 · Updated this week
- A framework for few-shot evaluation of language models. ☆6,904 · Updated this week
- Tensor library for machine learning ☆11,160 · Updated this week