Muhtasham / llm-inference-simulator
LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases.
⭐12 · Updated last week
Alternatives and similar repositories for llm-inference-simulator
Users who are interested in llm-inference-simulator are comparing it to the libraries listed below.
- vLLM adapter for a TGIS-compatible gRPC server. ⭐33 · Updated this week
- Make triton easier ⭐47 · Updated last year
- Build compute kernels ⭐77 · Updated this week
- ⭐36 · Updated 2 months ago
- AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ⭐79 · Updated 6 months ago
- Estimating hardware and cloud costs of LLMs and transformer projects ⭐18 · Updated 3 weeks ago
- A lightweight, user-friendly data-plane for LLM training. ⭐20 · Updated 2 weeks ago
- Repository for CPU Kernel Generation for LLM Inference ⭐26 · Updated 2 years ago
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024). ⭐25 · Updated this week
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ⭐42 · Updated last year
- A domain-specific language (DSL) based on Triton but providing higher-level abstractions. ⭐24 · Updated this week
- A collection of reproducible inference engine benchmarks ⭐32 · Updated 2 months ago
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ⭐110 · Updated last year
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ⭐19 · Updated this week
- Boosting 4-bit inference kernels with 2:4 Sparsity ⭐80 · Updated 10 months ago
- ⭐11 · Updated 5 months ago
- Sentence Embedding as a Service ⭐15 · Updated 2 weeks ago
- ⭐35 · Updated 3 weeks ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ⭐24 · Updated last year
- ⭐74 · Updated 3 months ago
- TensorRT LLM Benchmark Configuration ⭐13 · Updated 11 months ago
- ACL 2023 ⭐39 · Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ⭐31 · Updated last year
- NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ⭐66 · Updated 7 months ago
- Benchmarking PyTorch 2.0 different models ⭐21 · Updated 2 years ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ⭐109 · Updated 9 months ago
- python package of rocm-smi-lib ⭐21 · Updated 9 months ago
- QuIP quantization ⭐54 · Updated last year
- Compression for Foundation Models ⭐33 · Updated 3 months ago
- The backend behind the LLM-Perf Leaderboard ⭐10 · Updated last year