coreweave / tensorizer
Module, Model, and Tensor Serialization/Deserialization
☆217 · Updated last month
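tensorizer's purpose is writing module and tensor data straight to and from storage. As a rough, self-contained illustration of the general idea only (this is NOT tensorizer's actual API or on-disk format; the `serialize_tensors`/`deserialize_tensors` names and the length-prefixed layout are invented for this sketch), a minimal binary round trip of named float32 buffers in plain Python might look like:

```python
import struct

def serialize_tensors(tensors, path):
    # Hypothetical flat format: [tensor count] then, per tensor,
    # [name length][name bytes][element count][raw little-endian float32s].
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(tensors)))
        for name, values in tensors.items():
            encoded = name.encode("utf-8")
            f.write(struct.pack("<I", len(encoded)))
            f.write(encoded)
            f.write(struct.pack("<I", len(values)))
            f.write(struct.pack(f"<{len(values)}f", *values))

def deserialize_tensors(path):
    # Read the same layout back into a dict of name -> list of floats.
    tensors = {}
    with open(path, "rb") as f:
        (count,) = struct.unpack("<I", f.read(4))
        for _ in range(count):
            (name_len,) = struct.unpack("<I", f.read(4))
            name = f.read(name_len).decode("utf-8")
            (n,) = struct.unpack("<I", f.read(4))
            tensors[name] = list(struct.unpack(f"<{n}f", f.read(4 * n)))
    return tensors
```

The real library adds what a sketch like this lacks: streaming deserialization (e.g. from object storage) without buffering whole files, dtype/shape metadata, and direct loading into PyTorch modules.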
Alternatives and similar repositories for tensorizer:
Users interested in tensorizer are comparing it to the libraries listed below.
- ☆170 · Updated last week
- CUDA checkpoint and restore utility ☆306 · Updated last month
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆297 · Updated this week
- ☆296 · Updated 7 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆262 · Updated 5 months ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆351 · Updated this week
- FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens. ☆771 · Updated 6 months ago
- ☆30 · Updated last week
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the … ☆102 · Updated this week
- ☆116 · Updated last year
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆91 · Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆232 · Updated 2 weeks ago
- ☆237 · Updated last week
- PyTorch per-step fault tolerance (actively under development) ☆266 · Updated this week
- A library to analyze PyTorch traces. ☆348 · Updated last week
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆272 · Updated last year
- The Triton backend for PyTorch TorchScript models. ☆144 · Updated last week
- This repository contains the experimental PyTorch native float8 training UX ☆222 · Updated 7 months ago
- Fast low-bit matmul kernels in Triton ☆263 · Updated this week
- OpenAI-compatible API for the TensorRT-LLM Triton backend ☆201 · Updated 7 months ago
- Google TPU optimizations for transformers models ☆103 · Updated 2 months ago
- ☆180 · Updated 5 months ago
- Home for OctoML PyTorch Profiler ☆108 · Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆189 · Updated this week
- Pretrain, fine-tune and serve LLMs on Intel platforms with Ray ☆122 · Updated 3 weeks ago
- ☆190 · Updated last month
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆54 · Updated last month
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 · Updated 5 months ago
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆289 · Updated last month
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ☆177 · Updated this week