A collection of reproducible inference engine benchmarks
☆38Apr 22, 2025Updated 11 months ago
Alternatives and similar repositories for llm-ie-benchmarks
Users that are interested in llm-ie-benchmarks are comparing it to the libraries listed below.
- ☆32Apr 19, 2025Updated last year
- ☆32Updated this week
- Releasing the spot availability traces used in "Can't Be Late" paper.☆26Mar 31, 2024Updated 2 years ago
- A PyTorch native library for training speculative decoding models☆76Apr 11, 2026Updated last week
- Tutorial to get started with SkyPilot!☆58May 15, 2024Updated last year
- ☆34Nov 11, 2025Updated 5 months ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions☆21Apr 15, 2022Updated 4 years ago
- A Triton-only attention backend for vLLM☆25Mar 17, 2026Updated last month
- Evaluate your model using advanced prompt strategies☆21Jan 30, 2026Updated 2 months ago
- ExpertFingerprinting: Behavioral Pattern Analysis and Specialization Mapping of Experts in GPT-OSS-20B's Mixture-of-Experts Architecture☆26Feb 3, 2026Updated 2 months ago
- ☆28May 2, 2023Updated 2 years ago
- [NeurIPS '25] GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents☆76Mar 16, 2026Updated last month
- LLM-Inference-Bench☆61Jul 18, 2025Updated 9 months ago
- An experimental distributed execution engine☆23Jul 23, 2020Updated 5 years ago
- Convert MathML to Latex for OneNote to Markdown☆13Mar 17, 2026Updated last month
- Implementation of the Adaptive Resonance Theory (ART) architectures - Fuzzy ART and Fuzzy ARTMAP - for pattern recognition☆11Jan 6, 2019Updated 7 years ago
- The elegant integration of huggingface/nlp and fastai2 and handy transforms using pure huggingface/nlp☆19Oct 6, 2020Updated 5 years ago
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"☆22Feb 16, 2025Updated last year
- Codebase for HYPHEN, accepted at ACL 2022 (main)☆11May 17, 2022Updated 3 years ago
- Console for Kamaji, the Kubernetes Control Plane Manager☆16Oct 30, 2025Updated 5 months ago
- A powerful and user-friendly tool that generates detailed captions for your images☆21Nov 11, 2024Updated last year
- ☆52Updated this week
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"☆27Oct 14, 2025Updated 6 months ago
- ☆41Dec 7, 2025Updated 4 months ago
- Adaptive Resonance Theory models☆16May 12, 2017Updated 8 years ago
- Effective transpose on Hopper GPU☆28Sep 6, 2025Updated 7 months ago
- Source code for the AI2 Reasoning Challenge (ARC) submission.☆16Dec 8, 2022Updated 3 years ago
- Single file interpreter (or naive virtual machine) for my intermediate representation. SSA support has been added.☆15Apr 27, 2016Updated 9 years ago
- Pure Triton kernels for Qwen3.5-27B inference on NVIDIA B200☆98Feb 28, 2026Updated last month
- Inference code for LLaMA models☆21Apr 3, 2025Updated last year
- Comparison of Language Model Inference Engines☆242Dec 16, 2024Updated last year
- The offline version of acm-compiler-judge☆13May 16, 2019Updated 6 years ago
- Training tiny models to prove hard theorems☆72Mar 5, 2026Updated last month
- Kubernetes Tutorial for the PS2 group meetings at UC Berkeley☆16Mar 23, 2023Updated 3 years ago
- Ship correct and fast LLM kernels to PyTorch☆148Jan 14, 2026Updated 3 months ago
- Training and Inference Notebooks for the RedPajama (OpenLlama) models☆19May 18, 2023Updated 2 years ago
- alibabacloud-aiacc-demo☆43May 4, 2023Updated 2 years ago
- A mini Python tool for NAT traversal (intranet penetration) and reverse/forward proxying, implemented in under 100 lines of code☆12May 27, 2023Updated 2 years ago
- Course material for "Numerical Methods for Data Science" (SJTU, summer 2018)☆40Jul 6, 2018Updated 7 years ago