coreweave / tensorizer
Module, Model, and Tensor Serialization/Deserialization
☆187 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for tensorizer
- CUDA checkpoint and restore utility ☆220 Updated 6 months ago
- ☆266 Updated 2 months ago
- ☆106 Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆228 Updated this week
- Google TPU optimizations for transformers models ☆74 Updated this week
- Getting Started with the CoreWeave Kubernetes GPU Cloud ☆68 Updated last week
- ☆156 Updated last month
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆101 Updated last week
- NVIDIA NCCL Tests for Distributed Training ☆68 Updated this week
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆332 Updated 2 weeks ago
- ☆121 Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆163 Updated this week
- ☆155 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆250 Updated 3 weeks ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆152 Updated this week
- This repository contains the experimental PyTorch native float8 training UX ☆211 Updated 3 months ago
- ☆24 Updated this week
- Applied AI experiments and examples for PyTorch ☆159 Updated last week
- A library to analyze PyTorch traces. ☆297 Updated this week
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆46 Updated this week
- OpenAI compatible API for TensorRT LLM triton backend ☆174 Updated 3 months ago
- JAX implementation of the Llama 2 model ☆210 Updated 9 months ago
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆132 Updated 3 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆261 Updated last year
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat… ☆80 Updated this week
- FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens. ☆611 Updated 2 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆146 Updated this week
- Pipeline is an open source python SDK for building AI/ML workflows ☆130 Updated last month
- extensible collectives library in triton ☆61 Updated last month
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆190 Updated 2 weeks ago