A collection of reproducible inference engine benchmarks
☆38Apr 22, 2025Updated last year
Alternatives and similar repositories for llm-ie-benchmarks
Users that are interested in llm-ie-benchmarks are comparing it to the libraries listed below.
- ☆33Apr 19, 2025Updated last year
- ☆36Updated this week
- Releasing the spot availability traces used in "Can't Be Late" paper.☆26Mar 31, 2024Updated 2 years ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.☆14Mar 30, 2024Updated 2 years ago
- TSED with Flexible Parser☆20Jan 22, 2026Updated 3 months ago
- A PyTorch native library for training speculative decoding models☆93Apr 28, 2026Updated last week
- A sample pattern for running CI tests on Modal☆19Apr 12, 2025Updated last year
- ☆16Sep 9, 2023Updated 2 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions☆21Apr 15, 2022Updated 4 years ago
- TPU support for the fastai library☆13Apr 15, 2021Updated 5 years ago
- A Triton-only attention backend for vLLM☆25Mar 17, 2026Updated last month
- Evaluate your model using advanced prompt strategies☆21Jan 30, 2026Updated 3 months ago
- ExpertFingerprinting: Behavioral Pattern Analysis and Specialization Mapping of Experts in GPT-OSS-20B's Mixture-of-Experts Architecture☆27Feb 3, 2026Updated 3 months ago
- ☆28May 2, 2023Updated 3 years ago
- [NeurIPS '25] GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents☆78Apr 27, 2026Updated last week
- Convert MathML to Latex for OneNote to Markdown☆13Mar 17, 2026Updated last month
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"☆22Feb 16, 2025Updated last year
- ☆82Mar 11, 2025Updated last year
- NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions☆43Apr 23, 2026Updated 2 weeks ago
- ☆56Apr 13, 2026Updated 3 weeks ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"☆27Oct 14, 2025Updated 6 months ago
- [HVEI 2018] Colorizing Color Images☆12Nov 22, 2018Updated 7 years ago
- ☆41Dec 7, 2025Updated 5 months ago
- The data processing pipeline for the Koala chatbot language model☆118Apr 6, 2023Updated 3 years ago
- Adaptive Resonance Theory models☆16May 12, 2017Updated 8 years ago
- Source code for the AI2 Reasoning Challenge (ARC) submission.☆16Dec 8, 2022Updated 3 years ago
- ☆27Aug 17, 2025Updated 8 months ago
- Single file interpreter (or naive virtual machine) for my intermediate representation. SSA support has been added.☆15Apr 27, 2016Updated 10 years ago
- [Re-implementation] FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence☆15Jun 29, 2020Updated 5 years ago
- [ECCV18] Constraint-Aware Deep Neural Network Compression☆12Sep 11, 2018Updated 7 years ago
- Comparison of Language Model Inference Engines☆242Dec 16, 2024Updated last year
- Pure Triton kernels for Qwen3.5-27B inference on NVIDIA B200☆108Feb 28, 2026Updated 2 months ago
- ☆45May 6, 2025Updated last year
- Kubernetes Tutorial for the PS2 group meetings at UC Berkeley☆16Mar 23, 2023Updated 3 years ago
- 🕷 Let's go scrape pixiv, haha (Web scraper for pixiv)☆12Aug 13, 2018Updated 7 years ago
- ☆19Jun 11, 2024Updated last year
- Ship correct and fast LLM kernels to PyTorch☆149Jan 14, 2026Updated 3 months ago
- Training and Inference Notebooks for the RedPajama (OpenLlama) models☆19May 18, 2023Updated 2 years ago
- alibabacloud-aiacc-demo☆43May 4, 2023Updated 3 years ago