Muhtasham / llm-inference-simulator
🚀 LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases.
☆13 Updated 6 months ago
Alternatives and similar repositories for llm-inference-simulator
Users who are interested in llm-inference-simulator are comparing it to the libraries listed below.
- vLLM adapter for a TGIS-compatible gRPC server. ☆50 Updated last week
- ☆31 Updated 9 months ago
- ☆71 Updated 10 months ago
- ☆47 Updated 9 months ago
- A collection of reproducible inference engine benchmarks ☆38 Updated 9 months ago
- [WIP] Better (FP8) attention for Hopper ☆32 Updated 11 months ago
- Repository for CPU Kernel Generation for LLM Inference ☆28 Updated 2 years ago
- Open Source Projects from Pallas Lab ☆20 Updated 4 years ago
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆18 Updated last year
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆64 Updated last year
- ☆102 Updated last year
- QuIP quantization ☆61 Updated last year
- A curated list for Efficient Large Language Models ☆11 Updated last year
- ☆47 Updated last year
- Benchmark suite for LLMs from Fireworks.ai ☆89 Updated this week
- The driver for LMCache core to run in vLLM ☆60 Updated last year
- The backend behind the LLM-Perf Leaderboard ☆11 Updated last year
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆141 Updated last year
- Easy, Fast, and Scalable Multimodal AI ☆109 Updated this week
- Estimating hardware and cloud costs of LLMs and transformer projects ☆20 Updated 3 weeks ago
- Home for OctoML PyTorch Profiler ☆113 Updated 2 years ago
- Accepted to MLSys 2026 ☆70 Updated last week
- ☆61 Updated 2 years ago
- AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ☆80 Updated last year
- python package of rocm-smi-lib ☆24 Updated last month
- Make triton easier ☆50 Updated last year
- ☆120 Updated last year
- Boosting 4-bit inference kernels with 2:4 Sparsity ☆93 Updated last year
- ☆79 Updated last year
- LLM Serving Performance Evaluation Harness ☆83 Updated 11 months ago