wgwang / awesome-LLM-benchmarks
Awesome LLM benchmarks for evaluating LLMs across text, code, image, audio, video, and more.
☆159 Jan 3, 2024 Updated 2 years ago
Alternatives and similar repositories for awesome-LLM-benchmarks
Users who are interested in awesome-LLM-benchmarks are comparing it to the repositories listed below.
- Chinese large language models ☆6,376 Nov 30, 2024 Updated last year
- LLM evaluation. ☆16 Nov 7, 2023 Updated 2 years ago
- Dilation Gate CNN For Machine Reading Comprehension ☆17 Mar 24, 2023 Updated 2 years ago
- SUPERVAIZER is a toolkit built for the age of AI interoperability. At its core, it implements Google's Agent-to-Agent (A2A) protocol, ena… ☆14 Feb 4, 2026 Updated last week
- An end-to-end benchmark suite of multi-modal DNN applications for system-architecture co-design ☆22 Dec 13, 2024 Updated last year
- Transform feature ideas into production-ready code through systematic Spec-Driven Development ☆47 Jan 8, 2026 Updated last month
- Lightweight and Flexible Library for Creating Agents and Multi-Agent Conversations 🤖 ☆27 Dec 24, 2025 Updated last month
- [NAACL 2025 Main] AgentMove: A Large Language Model based Agentic Framework for Zero-shot Next Location Prediction. ☆41 Jul 26, 2025 Updated 6 months ago
- A feature-rich concurrency kit, yet another DAG framework ☆10 Jan 18, 2026 Updated 3 weeks ago
- Chemical Processes Instrumentation ☆14 Jun 3, 2023 Updated 2 years ago
- A Chinese LLM evaluation benchmark for the automotive industry, with fine-grained evaluation based on multi-turn open-ended questions ☆38 Dec 26, 2023 Updated 2 years ago
- SuperCLUE: A comprehensive benchmark for general-purpose Chinese foundation models ☆3,272 Feb 6, 2026 Updated last week
- Paster core module using KiteX ☆10 Aug 30, 2023 Updated 2 years ago
- Self hosted AI workflow for scraping Instagram Reels (audio and description). Extracting, summarising and categorising, then storing all … ☆27 Jan 10, 2026 Updated last month
- Some microbenchmarks and design docs before commencement ☆12 Feb 1, 2021 Updated 5 years ago
- The Chemical Reaction Optimization (CRO) algorithm with dependent classes in Python 3. ☆11 Apr 21, 2020 Updated 5 years ago
- Front-end interview preparation guide ☆14 Jan 31, 2025 Updated last year
- A job management system for Python ☆10 Jan 16, 2026 Updated 3 weeks ago
- DevOps practices (including shell, YAML, Python, Dockerfile, etc.), usable for quickly deploying environments and building CI/CD pipelines ☆11 Sep 24, 2023 Updated 2 years ago
- Dataset and codes for SEntFiN ☆10 May 31, 2023 Updated 2 years ago
- Canopy is a machine learning compiler stack with the capability of adopting high-end FPGAs. As a part of OpenAIOS project, Canop… ☆12 May 7, 2021 Updated 4 years ago
- ☆10 Sep 9, 2024 Updated last year
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, … ☆6,663 Updated this week
- This is a large set of data types, extension methods, and utilities designed to make your life a little easier. ☆12 Updated this week
- Surf productively and block websites. Website blocker + tab manager WebExtension. ☆11 Jul 23, 2018 Updated 7 years ago
- Useful snippets for your Chrome/Firefox browsers ☆16 Apr 10, 2022 Updated 3 years ago
- The Rights Protection Guide is an open knowledge base that aims to give consumers practical how-to guides and legal references for defending their rights. ☆11 Jan 11, 2023 Updated 3 years ago
- This is a short introduction to data analysis in Jupyter notebooks for chemical engineering students. ☆11 Mar 16, 2022 Updated 3 years ago
- ☆16 Apr 28, 2023 Updated 2 years ago
- This project aims to detect which parking lots (in fact, any user-defined polygons) are occupied by an object. ☆11 Aug 30, 2018 Updated 7 years ago
- Docker base images for C++ development using vcpkg ☆11 Jan 27, 2026 Updated 2 weeks ago
- Hyperparameter tuning with Optuna integrated tensor2tensor. ☆10 Oct 7, 2020 Updated 5 years ago
- Simple, non-authoritative benchmarks for embedded databases running in GitHub Actions ☆11 Jul 11, 2024 Updated last year
- Agentic framework combining the power of LLMs with domain-specific tools for materials science, enabling property extraction, simulations… ☆11 May 1, 2025 Updated 9 months ago
- A/B testing knowledge system. ☆12 Sep 24, 2020 Updated 5 years ago
- A WeChat mini program built with Taro, covering authorization prompts for WeChat login, user info, address, phone number, and saving images to the album; main features include an e-commerce purchase flow and map annotation. ☆12 Sep 4, 2020 Updated 5 years ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆43 Feb 15, 2024 Updated last year
- Agentic translation using reflection workflow, refactored and sugared. ☆11 Sep 25, 2024 Updated last year
- Chemical master equation solver ☆16 May 2, 2018 Updated 7 years ago