foundation-model-stack / fastsafetensors
High-performance safetensors model loader
☆34 · Updated last week
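For context, a minimal sketch of the baseline safetensors load path, using the reference `safetensors` package, that a high-performance loader like fastsafetensors is meant to speed up (file and tensor names here are illustrative, and the sketch does not show fastsafetensors' own API):

```python
# Minimal sketch of the standard safetensors load path using the
# reference `safetensors` package. fastsafetensors reads the same
# on-disk format; how it accelerates the load/device-copy path is
# not shown here. File and tensor names are illustrative.
import torch
from safetensors.torch import save_file, load_file

save_file({"weight": torch.zeros(1024, 1024)}, "model.safetensors")
tensors = load_file("model.safetensors", device="cpu")
print(tensors["weight"].shape)  # torch.Size([1024, 1024])
```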
Alternatives and similar repositories for fastsafetensors
Users interested in fastsafetensors are comparing it to the libraries listed below:
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the … ☆169 · Updated this week
- NVIDIA Inference Xfer Library (NIXL) ☆352 · Updated this week
- Extensible collectives library in Triton ☆87 · Updated 2 months ago
- CUDA checkpoint and restore utility ☆339 · Updated 4 months ago
- The driver for LMCache core to run in vLLM ☆39 · Updated 3 months ago
- ☆26 · Updated last month
- DeeperGEMM: crazy optimized version ☆69 · Updated 3 weeks ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate ☆150 · Updated this week
- ☆49 · Updated 2 months ago
- ☆25 · Updated 2 months ago
- Module, Model, and Tensor Serialization/Deserialization ☆232 · Updated last week
- Fast and memory-efficient exact attention ☆71 · Updated 3 weeks ago
- ☆35 · Updated 4 months ago
- ☆83 · Updated 5 months ago
- Boosting 4-bit inference kernels with 2:4 sparsity ☆75 · Updated 8 months ago
- High-speed GEMV kernels, with up to 2.7x speedup over the PyTorch baseline (see the baseline timing sketch after this list) ☆109 · Updated 10 months ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance ☆127 · Updated this week
- A lightweight design for computation-communication overlap ☆131 · Updated 3 weeks ago
- ☆214 · Updated this week
- ☆79 · Updated 6 months ago
- Perplexity GPU Kernels ☆318 · Updated last week
- ☆85 · Updated 2 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆116 · Updated this week
- KV cache store for distributed LLM inference ☆250 · Updated this week
- Applied AI experiments and examples for PyTorch ☆270 · Updated this week
- ☆61 · Updated 3 months ago
- ☆46 · Updated 11 months ago
- ☆193 · Updated 3 weeks ago
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training ☆45 · Updated last week
- Zero Bubble Pipeline Parallelism ☆395 · Updated 3 weeks ago
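For the GEMV-kernels entry above, a hypothetical micro-benchmark of the PyTorch baseline (`torch.mv`) that such kernels are typically measured against; shapes, dtype, and iteration count are illustrative, not taken from that repository:

```python
# Hypothetical micro-benchmark of the PyTorch GEMV baseline (torch.mv)
# that optimized GEMV kernels are typically compared against.
# Shapes, dtype, and iteration count are illustrative only.
import torch
import torch.utils.benchmark as benchmark

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
A = torch.randn(4096, 4096, device=device, dtype=dtype)
x = torch.randn(4096, device=device, dtype=dtype)

timer = benchmark.Timer(
    stmt="torch.mv(A, x)",
    globals={"torch": torch, "A": A, "x": x},
)
print(timer.timeit(100))  # prints a timing summary for 100 calls
```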