skypilot-org / skypilot-catalog
☆27 · Updated this week
Alternatives and similar repositories for skypilot-catalog
Users interested in skypilot-catalog are comparing it to the libraries listed below.
- Cray-LM unified training and inference stack. ☆22 · Updated last year
- ☆47 · Updated 2 years ago
- LM Engine is a library for pretraining and finetuning LLMs. ☆113 · Updated this week
- A collection of reproducible inference engine benchmarks. ☆38 · Updated 9 months ago
- AI-Driven Research Systems (ADRS). ☆117 · Updated last month
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆68 · Updated last week
- vLLM adapter for a TGIS-compatible gRPC server. ☆50 · Updated this week
- Tutorial to get started with SkyPilot! ☆58 · Updated last year
- Simple and efficient DeepSeek V3 SFT using pipeline parallelism and expert parallelism, with both FP8 and BF16 training. ☆114 · Updated 6 months ago
- Memory-optimized Mixture of Experts. ☆72 · Updated 6 months ago
- Benchmark suite for LLMs from Fireworks.ai. ☆86 · Updated 2 weeks ago
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …. ☆60 · Updated last year
- Storing long contexts in tiny caches with self-study. ☆231 · Updated last month
- Easy, fast, and scalable multimodal AI. ☆106 · Updated this week
- ☆237 · Updated 3 weeks ago
- PCCL (Prime Collective Communications Library) implements fault-tolerant collective communications over IP. ☆141 · Updated 4 months ago
- Manage ML configuration with pydantic. ☆16 · Updated last week
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM. ☆205 · Updated last week
- Google TPU optimizations for transformers models. ☆133 · Updated last week
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU Clusters. ☆131 · Updated last year
- [ICLR 2025] Breaking the Throughput-Latency Trade-off for Long Sequences with Speculative Decoding. ☆137 · Updated last year
- Repo hosting code and materials on speeding up LLM inference using token merging. ☆37 · Updated 3 months ago
- Make Triton easier. ☆50 · Updated last year
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient". ☆149 · Updated 2 years ago
- 👷 Build compute kernels. ☆214 · Updated last week
- Optimizing causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna. ☆59 · Updated 3 months ago
- LLM serving performance evaluation harness. ☆83 · Updated 11 months ago
- PyTorch-native distributed training library for LLMs/VLMs with out-of-the-box Hugging Face support. ☆259 · Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo). ☆474 · Updated 2 weeks ago
- Repository for CPU kernel generation for LLM inference. ☆27 · Updated 2 years ago