foundation-model-stack / fastsafetensors
High-performance safetensors model loader
☆18 Updated this week
Alternatives and similar repositories for fastsafetensors:
Users who are interested in fastsafetensors are comparing it to the libraries listed below.
- Perplexity GPU Kernels ☆134 Updated this week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆134 Updated 2 weeks ago
- extensible collectives library in triton ☆84 Updated this week
- ☆21 Updated last month
- A minimal implementation of vllm. ☆37 Updated 8 months ago
- Make triton easier ☆47 Updated 9 months ago
- ☆30 Updated last week
- Home for OctoML PyTorch Profiler ☆108 Updated last year
- Explore training for quantized models ☆17 Updated 2 months ago
- LLM Serving Performance Evaluation Harness ☆73 Updated last month
- NVIDIA Inference Xfer Library (NIXL) ☆230 Updated this week
- Applied AI experiments and examples for PyTorch ☆251 Updated 2 weeks ago
- Fast low-bit matmul kernels in Triton ☆275 Updated this week
- The driver for LMCache core to run in vLLM ☆36 Updated 2 months ago
- Module, Model, and Tensor Serialization/Deserialization ☆220 Updated last month
- Boosting 4-bit inference kernels with 2:4 Sparsity ☆72 Updated 7 months ago
- ☆49 Updated 4 months ago
- Pygloo provides Python bindings for Gloo. ☆21 Updated last month
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated last year
- PyTorch centric eager mode debugger ☆46 Updated 3 months ago
- A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL ☆19 Updated 2 weeks ago
- ☆54 Updated 6 months ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆65 Updated 3 years ago
- This repository contains the experimental PyTorch native float8 training UX ☆222 Updated 8 months ago
- ☆192 Updated last week
- Cloud Native Benchmarking of Foundation Models ☆24 Updated 4 months ago
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆60 Updated 2 months ago
- ☆54 Updated last month
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆92 Updated last week
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆107 Updated this week