Comprehensive LLM evaluation framework: GPQA Diamond to Chatbot Arena. Tests all major models equally, easily extensible.
☆17 Aug 22, 2024 Updated last year
Alternatives and similar repositories for BenchmarkAggregator
Users that are interested in BenchmarkAggregator are comparing it to the libraries listed below.
- Logical inference system based on event semantics and degree semantics in formal semantics ☆10 Jan 22, 2023 Updated 3 years ago
- a Haskell library that implements (Projective) Discourse Representation Theory (DRT) ☆27 Sep 15, 2022 Updated 3 years ago
- ☆46 May 3, 2026 Updated last month
- TPTP python library and benchmarking service ☆13 Oct 2, 2019 Updated 6 years ago
- PANiC - PAraphrasing Noun-Compounds ☆15 Apr 6, 2018 Updated 8 years ago
- Resources accompanying the "Zero-Shot Recommendation as Language Modeling" paper (ECIR2022) ☆14 May 25, 2023 Updated 3 years ago
- Data and all ☆14 Sep 30, 2019 Updated 6 years ago
- Implement the essential operators from Allen's Interval Algebra, and also some metaprogramming for combinatorial operators ☆13 Oct 17, 2018 Updated 7 years ago
- [ECAI 2023] Official implementation of "FATRER: Full-Attention Topic Regularizer for Accurate and Robust Conversational Emotion Recogniti… ☆13 Oct 9, 2023 Updated 2 years ago
- Code repository for the WWW 2019 paper "Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness" ☆12 Feb 1, 2019 Updated 7 years ago
- An abductive reasoning engine written in C++. ☆13 Dec 28, 2018 Updated 7 years ago
- Automated Semantic Analysis of Discourse Markers ☆11 May 30, 2022 Updated 4 years ago
- Allows two LLMs to communicate and run code in the terminal ☆28 Dec 8, 2024 Updated last year
- ☆13 Jul 28, 2023 Updated 2 years ago
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization ☆29 Jul 9, 2024 Updated last year
- ☆26 Aug 2, 2025 Updated 10 months ago
- [ACL 2024] DiFiNet: Boundary-Aware Semantic Differentiation and Filtration Network for Nested Named Entity Recognition ☆17 Oct 2, 2024 Updated last year
- Coding-agent VM orchestrator: runs coding agents in isolated VMs — Firecracker micro-VMs on Linux (with ZFS-based audit-trail snapshots) … ☆30 Updated this week
- A web interactive tool for building proofs in the sequent calculus of Linear Logic, with its backend written in OCaml ☆25 Apr 7, 2025 Updated last year
- ☆26 Apr 15, 2023 Updated 3 years ago
- ☆13 Apr 6, 2025 Updated last year
- Dataset from Tip of the Tongue Known-Item Retrieval (2021) paper. ☆12 Nov 4, 2021 Updated 4 years ago
- Create your own RVC v2 dataset from a youtube video ☆31 Jan 27, 2024 Updated 2 years ago
- A set of tools to simplify development for JavaScript based SmartTVs ☆15 May 11, 2026 Updated last month
- ☆70 Mar 24, 2026 Updated 2 months ago
- an auto-sleeping and -waking framework around llama.cpp ☆13 Feb 8, 2025 Updated last year
- ☆18 Feb 23, 2025 Updated last year
- ☆12 May 30, 2025 Updated last year
- ☆19 Sep 24, 2024 Updated last year
- An MCP server implementation providing a standardized interface for LLMs to interact with the Atla API. ☆18 Jul 21, 2025 Updated 10 months ago
- Tornado Oauth 2 client ☆17 Dec 20, 2022 Updated 3 years ago
- Roboflow's inference server to analyze video streams. This project extracts insights from video frames at defined intervals and generates… ☆11 May 21, 2024 Updated 2 years ago
- Python library to interact with Tuya BLE protocol ☆30 Jul 21, 2022 Updated 3 years ago
- Official code for [NeurIPS 2025] C3Po: Cross-View Cross-Modality Correspondence by Pointmap Prediction ☆26 Jan 27, 2026 Updated 4 months ago
- This repository provides a framework to serve LLM (Large Language Model) based applications such as Chatbot. ☆18 Apr 20, 2023 Updated 3 years ago
- ISS Tracker for the Cardputer Adv ☆47 Jan 19, 2026 Updated 4 months ago
- An LLM-enchanced Infocom Experience ☆23 Apr 19, 2025 Updated last year
- Enables AI agents to use Google Maps features (geocoding, elevation, search, directions) via the Agent-to-Agent (A2A) protocol. ☆17 Apr 29, 2025 Updated last year
- [RA-L] SHeRLoc: Synchronized Heterogeneous Radar Place Recognition for Cross-Modal Localization ☆33 Nov 24, 2025 Updated 6 months ago