skypilot-org / skypilot-catalog
☆23 Updated this week
Alternatives and similar repositories for skypilot-catalog:
Users who are interested in skypilot-catalog are comparing it to the libraries listed below.
- Tutorial to get started with SkyPilot! ☆57 Updated 11 months ago
- ☆45 Updated 9 months ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆41 Updated this week
- A collection of reproducible inference engine benchmarks ☆24 Updated this week
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆66 Updated this week
- ❓Curie: Automated and Rigorous Scientific Experimentation with AI Agents ☆77 Updated this week
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆115 Updated 4 months ago
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆61 Updated 3 months ago
- vLLM adapter for a TGIS-compatible gRPC server. ☆26 Updated this week
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆126 Updated 4 months ago
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated last week
- Make triton easier ☆47 Updated 10 months ago
- Tune efficiently any LLM model from HuggingFace using distributed training (multiple GPU) and DeepSpeed. Uses Ray AIR to orchestrate the … ☆56 Updated last year
- PyTorch centric eager mode debugger ☆47 Updated 4 months ago
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆138 Updated last year
- NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆64 Updated 4 months ago
- LLM Serving Performance Evaluation Harness ☆77 Updated 2 months ago
- Load compute kernels from the Hub ☆115 Updated this week
- ☆43 Updated last year
- ☆28 Updated last year
- ☆30 Updated 2 years ago
- The backend behind the LLM-Perf Leaderboard ☆10 Updated 11 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆86 Updated this week
- A minimal implementation of vllm. ☆39 Updated 8 months ago
- ☆11 Updated 2 months ago
- ML/DL Math and Method notes ☆60 Updated last year
- ☆21 Updated last month
- A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL ☆19 Updated last week
- ☆15 Updated 3 weeks ago
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆154 Updated 7 months ago