A collection of reproducible inference engine benchmarks
☆38Apr 22, 2025Updated 11 months ago
Alternatives and similar repositories for llm-ie-benchmarks
Users that are interested in llm-ie-benchmarks are comparing it to the libraries listed below.
- ☆32Apr 19, 2025Updated 11 months ago
- Releasing the spot availability traces used in "Can't Be Late" paper.☆25Mar 31, 2024Updated last year
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.☆14Mar 30, 2024Updated 2 years ago
- Tutorial to get started with SkyPilot!☆59May 15, 2024Updated last year
- ☆34Nov 11, 2025Updated 4 months ago
- TPU support for the fastai library☆13Apr 15, 2021Updated 4 years ago
- A Triton-only attention backend for vLLM☆24Mar 17, 2026Updated last week
- A context-aware embedding similarity score☆11Aug 23, 2023Updated 2 years ago
- ExpertFingerprinting: Behavioral Pattern Analysis and Specialization Mapping of Experts in GPT-OSS-20B's Mixture-of-Experts Architecture☆24Feb 3, 2026Updated last month
- Few-shot text classification with meta learning and BERT☆11Jun 14, 2021Updated 4 years ago
- See how HTTPX, Requests, and AIOHTTP libraries compare for sending network requests and find out which one may fit your case better.☆21Sep 25, 2025Updated 6 months ago
- [NeurIPS '25] GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents☆74Mar 16, 2026Updated last week
- ☆28May 2, 2023Updated 2 years ago
- LLM-Inference-Bench☆60Jul 18, 2025Updated 8 months ago
- Convert MathML to Latex for OneNote to Markdown☆12Mar 17, 2026Updated last week
- Implementation of the Adaptive Resonance Theory (ART) architectures - Fuzzy ART and Fuzzy ARTMAP - for pattern recognition☆11Jan 6, 2019Updated 7 years ago
- ☆12Apr 18, 2019Updated 6 years ago
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"☆22Feb 16, 2025Updated last year
- ☆81Mar 11, 2025Updated last year
- Pure Triton kernels for Qwen3.5-27B inference on NVIDIA B200☆83Feb 28, 2026Updated last month
- A powerful and user-friendly tool that generates detailed captions for your images☆21Nov 11, 2024Updated last year
- ☆51Feb 27, 2026Updated last month
- ☆14Jan 11, 2022Updated 4 years ago
- Training tiny models to prove hard theorems☆64Mar 5, 2026Updated 3 weeks ago
- ☆41Dec 7, 2025Updated 3 months ago
- Effective transpose on Hopper GPU☆28Sep 6, 2025Updated 6 months ago
- Source code for the AI2 Reasoning Challenge (ARC) submission.☆16Dec 8, 2022Updated 3 years ago
- [Re-implementation] FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence☆15Jun 29, 2020Updated 5 years ago
- Comparison of Language Model Inference Engines☆241Dec 16, 2024Updated last year
- Inference code for LLaMA models☆21Apr 3, 2025Updated 11 months ago
- ☆44May 6, 2025Updated 10 months ago
- Building a Deep Learning Powered Emoji Slackbot!☆16Jul 23, 2020Updated 5 years ago
- Winning 2nd place 🥈 at NUS CS5228 in-class Kaggle competition 2018!☆13Nov 13, 2018Updated 7 years ago
- A Test Collection of Computer Science Papers for Faceted Query by Example☆23Nov 28, 2021Updated 4 years ago
- Kubernetes Tutorial for the PS2 group meetings at UC Berkeley☆16Mar 23, 2023Updated 3 years ago
- ☆19Jul 12, 2025Updated 8 months ago
- Reference answers for Chapter 5 of 《计算模型导引》 (Introduction to Models of Computation)☆14Jun 18, 2019Updated 6 years ago
- ☆19Jun 11, 2024Updated last year
- Ship correct and fast LLM kernels to PyTorch☆147Jan 14, 2026Updated 2 months ago