☆21Jun 27, 2024Updated last year
Alternatives and similar repositories for GameBench
Users that are interested in GameBench are comparing it to the libraries listed below
- Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard☆23Dec 14, 2024Updated last year
- The official implementation of InfoRM [NeurIPS 2024].☆15Oct 25, 2025Updated 4 months ago
- Experimental LLM interface exploring new ways to use AI to improve human thinking☆19Feb 24, 2026Updated last week
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆13Aug 8, 2025Updated 6 months ago
- Conversational chatbot to answer questions about AI Safety & Alignment based on information retrieved from the Alignment Research Dataset☆15Feb 18, 2026Updated last week
- ☆79Nov 19, 2024Updated last year
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models☆32Nov 27, 2025Updated 3 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Jun 10, 2024Updated last year
- Rescuing Wikipedia articles from deletion☆35Oct 9, 2025Updated 4 months ago
- Implementations of Curious Replay for model-based adaptation.☆43Jul 5, 2023Updated 2 years ago
- Repository for the paper: "Using deep learning to predict outcomes of legal appeals better than human experts"☆10Aug 1, 2022Updated 3 years ago
- Tools and models for estimating Filecoin energy use from on-chain proofs☆11Jun 14, 2024Updated last year
- A repo for RLHF training and BoN over LLMs, with support for reward model ensembles.☆46Jan 16, 2025Updated last year
- Active Inference & Category Theory☆10Mar 11, 2024Updated last year
- ☆14Jun 7, 2023Updated 2 years ago
- Conceptual Construct Representations☆11Feb 23, 2023Updated 3 years ago
- ☆47Mar 25, 2025Updated 11 months ago
- simulate linkstate algorithm for routing☆10Nov 6, 2023Updated 2 years ago
- ☆10Mar 19, 2024Updated last year
- PyTorch code for the NeurIPS 2021 paper: Fairness via Representation Neutralization☆10Oct 26, 2021Updated 4 years ago
- SentiStorm - Real-time Twitter Sentiment Classification based on Apache Storm☆10May 22, 2018Updated 7 years ago
- Corpus to accompany: "Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding"☆11Apr 11, 2025Updated 10 months ago
- ☆11Oct 11, 2023Updated 2 years ago
- ☆15Dec 2, 2025Updated 3 months ago
- LensVM specifications and ABI definition☆12Apr 10, 2021Updated 4 years ago
- Turing machine ZKVM☆10Nov 12, 2023Updated 2 years ago
- Supplement of the ICFP'22 paper "‘do’ Unchained: Embracing Local Imperativity in a Purely Functional Language"☆14Feb 15, 2025Updated last year
- ☆12Jun 24, 2021Updated 4 years ago
- Testground: SDK for developing test plans in Go☆12May 24, 2024Updated last year
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models☆10Oct 27, 2023Updated 2 years ago
- ☆11Feb 23, 2026Updated last week
- ☆10Sep 26, 2024Updated last year
- cpp_string☆13May 16, 2022Updated 3 years ago
- Bayesian scaling laws for in-context learning.☆15Mar 12, 2025Updated 11 months ago
- xlvector's solution of github contest☆33Aug 30, 2009Updated 16 years ago
- ☆13Feb 18, 2026Updated last week
- ☆11Mar 13, 2023Updated 2 years ago
- A sample web-based email provider that provides first-class BrowserID support.☆69Jan 30, 2014Updated 12 years ago
- [AAAI26] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie…☆10Feb 7, 2026Updated 3 weeks ago