LLM benchmarks
☆13Feb 22, 2024Updated 2 years ago
Alternatives and similar repositories for llm-benchmarks
Users who are interested in llm-benchmarks are comparing it to the libraries listed below
- Your AI assistant in the terminal.☆23Nov 22, 2024Updated last year
- The evaluation framework for the InfiCoder-Eval benchmark.☆21Jul 22, 2024Updated last year
- Touhou BGM on VGS for iOS☆14May 15, 2021Updated 4 years ago
- A Node.js / Neo4j tool that translates words and relations into network graphs and shows you how it all connects.☆11Oct 24, 2019Updated 6 years ago
- Base mech☆40Updated this week
- [CVPR2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated last year
- A math-wiki reference for university students, competition students, and high school students☆10May 26, 2022Updated 3 years ago
- ☆12Jan 11, 2026Updated last month
- HK mod adding a customizable HP bar for all enemies and bosses.☆14Mar 1, 2022Updated 4 years ago
- Vue 2 corporate website☆10Feb 6, 2017Updated 9 years ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- A utility to measure weight using the Wii Balance Board.☆12Feb 20, 2025Updated last year
- A framework for few-shot evaluation of autoregressive language models.☆12Jul 14, 2025Updated 7 months ago
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- Yet another galgame☆37Jan 6, 2026Updated 2 months ago
- ☆11Aug 9, 2022Updated 3 years ago
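- A Chinese financial LLM evaluation benchmark: six categories, twenty-five tasks, graded evaluation; domestic (Chinese) models achieve an A grade☆10May 6, 2024Updated last year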
- The frontend of Penguin Statistics' embed widget. Provides embeddable statistical data to external websites.☆11Apr 13, 2023Updated 2 years ago
- A Subjective Web-of-Trust for Decentralized Moderation and Peer Review☆13Jan 20, 2024Updated 2 years ago
- Elixir learning documentation in Chinese https://elixir-development.github.io/ElixirDocs☆10Nov 17, 2019Updated 6 years ago
- ☆11Nov 5, 2024Updated last year
- A Telegram Bot that reminds you to take your medication on time.☆15Feb 8, 2023Updated 3 years ago
- Slop Scoring to Stop Slop☆48Updated this week
- Website for release of TellMeWhy dataset for why question answering☆14Nov 11, 2022Updated 3 years ago
- xtlsoft's technical blog.☆10Aug 3, 2023Updated 2 years ago
- ☆12Nov 5, 2024Updated last year
- Challenge, Rethink, Ascend☆20Feb 13, 2026Updated 3 weeks ago
- ☆12Mar 5, 2025Updated last year
- A V5 brain emulator that can run most .v5python programs☆11Jan 27, 2023Updated 3 years ago
- ☆11Oct 11, 2023Updated 2 years ago
- Code and Data for GlitchBench☆13Feb 27, 2024Updated 2 years ago
- ☆19Apr 25, 2025Updated 10 months ago
- LLM red teaming datasets from the paper 'Student-Teacher Prompting for Red Teaming to Improve Guardrails' for the ART of Safety Workshop …☆22Oct 12, 2023Updated 2 years ago
- ☆32Jan 20, 2016Updated 10 years ago
- A Lean 4 representation of the Rubik's Cube, some proofs about the representation, and a simple solution algorithm.☆11Jul 16, 2025Updated 7 months ago
- An erc-20 token with EIP750 style zk private transactions and a new nullifier scheme to allow partial spends☆11Apr 11, 2025Updated 10 months ago
- How to write an academic paper☆11Oct 20, 2022Updated 3 years ago
- Build a React runtime environment from scratch with Webpack and optimize the bundling strategy☆10Aug 26, 2022Updated 3 years ago
- tools for creating computer-generated, corpus-driven graded readers☆25May 18, 2020Updated 5 years ago