manuelescobar-dev / LLM-Tools

Open-source calculator for LLM system requirements.

☆175

Alternatives and similar repositories for LLM-Tools

Users who are interested in LLM-Tools are comparing it to the repositories listed below.

lapp0 / lm-inference-engines
Comparison of Language Model Inference Engines
☆235 Updated 11 months ago
GenseeAI / cognify
Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower exe…
☆261 Updated 6 months ago
inferflow / inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
☆249 Updated last year
MooreThreads / TurboRAG
☆89 Updated 11 months ago
vllm-project / guidellm
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
☆708 Updated this week
sihyeong / Awesome-LLM-Inference-Engine
☆150 Updated 5 months ago
yale-sys / prompt-cache
Modular and structured prompt caching for low-latency LLM inference
☆103 Updated last year
vectorch-ai / ScaleLLM
A high-performance inference system for large language models, designed for production environments.
☆482 Updated 2 weeks ago
intel / llm-on-ray
Pretrain, finetune and serve LLMs on Intel platforms with Ray
☆131 Updated 2 months ago
snowflakedb / ArcticInference
ArcticInference: vLLM plugin for high-throughput, low-latency inference
☆300 Updated this week
thunlp / InfLLM
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Mem…
☆388 Updated last year
efeslab / Nanoflow
A throughput-oriented high-performance serving framework for LLMs
☆915 Updated 3 weeks ago
ninehills / llm-inference-benchmark
LLM Inference benchmark
☆430 Updated last year
microsoft / MInference
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention…
☆1,152 Updated last month
bentoml / llm-bench
☆56 Updated last year
hkust-nlp / CodeIO
[ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
☆560 Updated 6 months ago
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆650 Updated this week
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆268 Updated 3 months ago
hao-ai-lab / Dynasor
[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training.
☆206 Updated 5 months ago
ServerlessLLM / ServerlessLLM
Serverless LLM Serving for Everyone.
☆603 Updated this week
project-etalon / etalon
LLM Serving Performance Evaluation Harness
☆80 Updated 8 months ago
intel / xFasterTransformer
☆431 Updated 2 months ago
ray-project / llmperf
LLMPerf is a library for validating and benchmarking LLMs
☆1,046 Updated 11 months ago
triton-inference-server / vllm_backend
☆312 Updated last week
backprop-ai / vllm-benchmark
Benchmarking the serving capabilities of vLLM
☆56 Updated last year
metame-ai / awesome-llm-plaza
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.
☆210 Updated 3 weeks ago
run-ai / llmperf
☆57 Updated last year
fw-ai / benchmark
Benchmark suite for LLMs from Fireworks.ai
☆83 Updated last week
efeslab / fiddler
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
☆243 Updated last year
terryyz / llm-benchmark
A list of LLM benchmark frameworks.
☆72 Updated last year