Awesome AI Benchmarks
☆29 · Jan 16, 2026 · Updated 3 months ago
Alternatives and similar repositories for awesome-ai-benchmarks
Users that are interested in awesome-ai-benchmarks are comparing it to the libraries listed below.
- LLM caching proxy server that emulates popular LLMs with the ability to simulate failures ☆75 · Aug 4, 2025 · Updated 8 months ago
- Code repository for CISO agent as part of ITBench ☆20 · May 8, 2025 · Updated 11 months ago
- Experiments on using ChatGPT for failure mode classification ☆12 · Sep 20, 2023 · Updated 2 years ago
- Use an appropriate mix of LLMs based on https://nuenki.app/blog research to translate languages better than any one tool. ☆27 · Jun 23, 2025 · Updated 9 months ago
- A simple, easy-to-customize pipeline for local RAG evaluation. Starter prompts and metric definitions included. ☆25 · Jan 14, 2026 · Updated 3 months ago
- Visually select, search, and copy your code into your clipboard for LLM context. ☆26 · May 18, 2025 · Updated 11 months ago
- ☆22 · Feb 28, 2025 · Updated last year
- FailureSensorIQ, a dataset and benchmark to probe LLMs’ reasoning and comprehension of sensor–failure relationships in industrial systems… ☆36 · Apr 10, 2026 · Updated last week
- [AAAI'25] The implementation of paper "Federated Foundation Models on Heterogeneous Time Series" | The first work to explore time series … ☆22 · Feb 2, 2026 · Updated 2 months ago
- Analyze Reddit posts ☆30 · Feb 27, 2025 · Updated last year
- MLflow deployment plugin for IBM-cloud-watson-ml ☆15 · May 7, 2025 · Updated 11 months ago
- Create text chunks which end at natural stopping points without using a tokenizer ☆26 · Nov 26, 2025 · Updated 4 months ago
- A tool for adding function calling to an LLM API, available as a service by following the link ☆22 · Aug 11, 2025 · Updated 8 months ago
- ACPBench: Reasoning about Action, Change, and Planning. A benchmark designed to evaluate the fundamental reasoning abilities in the dom… ☆33 · Feb 11, 2026 · Updated 2 months ago
- Efficient and readable change point detection package implemented in Python. (Singular Spectrum Transformation - SST, IKA-SST, ulSIF, RuL… ☆35 · Mar 14, 2026 · Updated last month
- Code repository for SRE agent as part of ITBench ☆19 · Sep 9, 2025 · Updated 7 months ago
- Lightning fast code searching made easy ☆18 · Jul 20, 2024 · Updated last year
- Emulating SAMSUNG HM641JI HDD firmware using Unicorn ☆11 · Sep 19, 2022 · Updated 3 years ago
- [KDD 2025] AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation ☆33 · Nov 18, 2025 · Updated 5 months ago
- A bit-array manipulation library in C ☆11 · Oct 29, 2021 · Updated 4 years ago
- ChatGPT CSS style ☆14 · Apr 28, 2024 · Updated last year
- The accompanying backend for the PAI app ☆12 · Mar 24, 2025 · Updated last year
- ☆12 · May 30, 2025 · Updated 10 months ago
- PoC of injecting code into a running Linux process ☆23 · Sep 11, 2019 · Updated 6 years ago
- A bot that provides YouTube video chapters on Twitter (a.k.a. X) ☆12 · Feb 5, 2025 · Updated last year
- In-Situ Evaluator: Real-Time Subsample Analysis ☆15 · Jan 25, 2026 · Updated 2 months ago
- Wallaby create-react-app TypeScript ☆11 · Jan 3, 2023 · Updated 3 years ago
- Better Encrypted Datastore is a library for securely storing encrypted data inside Datastore. In addition, the library extends Datastore'… ☆13 · Mar 23, 2025 · Updated last year
- llm-eval-simple is a simple LLM evaluation framework with intermediate actions and prompt pattern selection ☆65 · Feb 28, 2026 · Updated last month
- Experiments with compile-time metaprogramming ☆11 · Dec 29, 2025 · Updated 3 months ago
- ☆16 · Feb 1, 2025 · Updated last year
- Wrapper around Ghidra's analyzeHeadless script ☆13 · Feb 5, 2022 · Updated 4 years ago
- A web site for uploading and sharing solutions for the game TIS-100. ☆13 · Jun 22, 2025 · Updated 9 months ago
- ☆19 · Jun 11, 2025 · Updated 10 months ago
- A toy for exploring arbitrary MAP rules (life-like rules, isotropic rules and so on) ☆15 · Dec 11, 2025 · Updated 4 months ago
- A Rust FTP client implementation. ☆14 · Jul 13, 2024 · Updated last year
- AdaLLM is an NVFP4-first inference runtime for Ada Lovelace (RTX 4090) with FP8 KV cache and custom decode kernels. This repo targets NVF… ☆109 · Feb 15, 2026 · Updated 2 months ago
- USB HID Monitor control utility ☆13 · Dec 17, 2025 · Updated 4 months ago
- Android app for the Hole in your Palm project, making LLMs accessible on-device! ☆18 · May 3, 2024 · Updated last year