A package dedicated to running benchmark agreement testing
☆17Sep 18, 2025Updated 5 months ago
Alternatives and similar repositories for benchbench
Users who are interested in benchbench are comparing it to the libraries listed below
- ♠️TrucoBench: Which LLM is best at truco? Results, analyses, and strategic insights.☆19Feb 24, 2025Updated last year
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆211Feb 16, 2026Updated last week
- ☆13Oct 5, 2025Updated 4 months ago
- This repository collects lecture slides, assignments (CAs), code notebooks, reports, and reference papers used in the "Deep Generative Mo…☆17Feb 14, 2026Updated 2 weeks ago
- Build an AI bot in Discord to serve user's personalized reports on what's up in tech☆28Sep 14, 2025Updated 5 months ago
- Embedding Recycling for Language models☆38Jul 11, 2023Updated 2 years ago
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch☆10Aug 7, 2024Updated last year
- my profile readme☆14Updated this week
- The AI Alliance project to define a reference stack for AI model and system evaluation, with evaluations, benchmarks, and leaderboards.☆13Feb 15, 2026Updated last week
- Esolang inspired by The Demon Girl Next Door(まちカドまぞく)☆12Apr 17, 2025Updated 10 months ago
- Reference implementation of Thin and Deep Gaussian Processes (NeurIPS 2023)☆14Nov 25, 2024Updated last year
- This project aims to convert the content of GitHub repositories into a structured, machine-readable format, enabling AI models like ChatG…☆12May 13, 2024Updated last year
- ☆10Oct 2, 2024Updated last year
- Code for experiments on self-prediction as a way to measure introspection in LLMs☆16Dec 10, 2024Updated last year
- ☆12Jul 8, 2024Updated last year
- Tools to cluster visually similar images into groups in an image dataset☆11Jul 29, 2022Updated 3 years ago
- Python package for compressing floating-point PyTorch tensors☆13Jul 22, 2024Updated last year
- 1st Place Team Crane: @aswinkumar1999 @rathull @kyolebu☆29Sep 8, 2025Updated 5 months ago
- 🤖 Implementation of Self Normalizing Networks (SNN) in PyTorch.☆12Jun 19, 2017Updated 8 years ago
- Nadir: Cutting-edge PyTorch optimizers for simplicity & composability! 🔥🚀💻☆14Jun 15, 2024Updated last year
- A Multi-domain Benchmark for Personalized Search Evaluation☆12Sep 7, 2023Updated 2 years ago
- ☆10Apr 21, 2025Updated 10 months ago
- ☆16Jun 30, 2025Updated 8 months ago
- [COLING 2022]: CommunityLM: Probing Partisan Worldviews from Language Models☆15Jan 31, 2023Updated 3 years ago
- The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models☆12Oct 28, 2024Updated last year
- Viewer for text datasets in formats like HuggingFace, JSONL, etc.☆15Feb 25, 2025Updated last year
- Learning to Skip the Middle Layers of Transformers☆17Aug 7, 2025Updated 6 months ago
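- beko-translate is a CLI translation tool for Apple Silicon Macs. It also bundles a PDF two-page-spread translation feature that can display the original and translated text alternately.☆32Feb 12, 2026Updated 2 weeks ago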
- A simple model for predicting soccer outcomes☆11Jul 12, 2024Updated last year
- Unofficial implementation of "Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle"☆13Jul 3, 2024Updated last year
- The course work repo for UoSurrey EEEM071 (2023 Spring)☆11May 9, 2023Updated 2 years ago
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc...☆13May 28, 2025Updated 9 months ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- Predicting the Stock Market - Can we do it?☆10Jul 24, 2021Updated 4 years ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)☆12Oct 31, 2024Updated last year
- ☆15Aug 19, 2025Updated 6 months ago
- A Java-based framework for combinatorial test input generation, fault characterization and automated test execution.☆11Jan 22, 2024Updated 2 years ago
- The official evaluation suite and dynamic data release for MixEval.☆11Sep 23, 2024Updated last year
- ☆14May 21, 2024Updated last year