sgl-project / sglang-jax
JAX backend for SGL
☆175 Updated this week
Alternatives and similar repositories for sglang-jax
Users who are interested in sglang-jax are comparing it to the libraries listed below.
- ByteCheckpoint: A Unified Checkpointing Library for LFMs ☆252 Updated 4 months ago
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆230 Updated last week
- ring-attention experiments ☆155 Updated last year
- How to ensure correctness and ship LLM generated kernels in PyTorch ☆121 Updated last week
- extensible collectives library in triton ☆91 Updated 7 months ago
- Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio… ☆68 Updated 2 months ago
- Allow torch tensor memory to be released and resumed later ☆167 Updated last week
- ☆93 Updated last year
- ☆250 Updated this week
- Triton-based implementation of Sparse Mixture of Experts. ☆248 Updated last month
- PyTorch bindings for CUTLASS grouped GEMM. ☆130 Updated 5 months ago
- Collection of kernels written in Triton language ☆167 Updated 7 months ago
- Applied AI experiments and examples for PyTorch ☆305 Updated 3 months ago
- TPU inference for vLLM, with unified JAX and PyTorch support. ☆163 Updated this week
- Triton-based Symmetric Memory operators and examples ☆63 Updated last month
- DeeperGEMM: crazy optimized version ☆73 Updated 6 months ago
- Cataloging released Triton kernels. ☆267 Updated 2 months ago
- torchcomms: a modern PyTorch communications API ☆291 Updated this week
- kernels, of the mega variety ☆608 Updated last month
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆286 Updated this week
- Fast low-bit matmul kernels in Triton ☆398 Updated this week
- ☆148 Updated 10 months ago
- An early research stage MoE load balancer based on linear programming. ☆228 Updated this week
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer ☆143 Updated 2 months ago
- A Quirky Assortment of CuTe Kernels ☆660 Updated 3 weeks ago
- NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com… ☆385 Updated last week
- A lightweight design for computation-communication overlap. ☆187 Updated last month
- ☆316 Updated last week
- ☆97 Updated 7 months ago
- ☆113 Updated last year