Comprehensive LLM evaluation framework: GPQA Diamond to Chatbot Arena. Tests all major models equally, easily extensible.
☆17 · Aug 22, 2024 · Updated last year
Alternatives and similar repositories for BenchmarkAggregator
Users that are interested in BenchmarkAggregator are comparing it to the libraries listed below.
- Free e-books, built from markdown ☆13 · Jul 15, 2024 · Updated last year
- ☆45 · May 3, 2026 · Updated 2 weeks ago
- TPTP python library and benchmarking service ☆13 · Oct 2, 2019 · Updated 6 years ago
- PANiC - PAraphrasing Noun-Compounds ☆15 · Apr 6, 2018 · Updated 8 years ago
- Python library providing a simple, fully supervised sentence embedding technique for textual adversarial attacks. ☆13 · Dec 13, 2023 · Updated 2 years ago
- Data and all ☆14 · Sep 30, 2019 · Updated 6 years ago
- Data and related code for ACL2019 paper "Implicit Discourse Relation Identification for Open-domain Dialogues" ☆12 · Jul 29, 2019 · Updated 6 years ago
- Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020. ☆13 · Nov 17, 2020 · Updated 5 years ago
- An open-source NLP library: fast text cleaning and preprocessing ☆23 · Nov 9, 2021 · Updated 4 years ago
- Allows two LLMs to communicate and run code in the terminal ☆28 · Dec 8, 2024 · Updated last year
- GQR, a Fast Reasoner for Binary Qualitative Constraint Calculi ☆20 · Nov 11, 2017 · Updated 8 years ago
- ☆18 · Feb 29, 2024 · Updated 2 years ago
- Analytic tableau based minimal model generator, model checker and theorem prover for first-order logic with modal extensions ☆20 · Aug 22, 2025 · Updated 9 months ago
- Neural Unification for Logic Reasoning over Language ☆22 · Nov 15, 2021 · Updated 4 years ago
- ☆22 · May 4, 2024 · Updated 2 years ago
- Discourse Based Evaluation of Language Understanding ☆21 · Jan 28, 2023 · Updated 3 years ago
- Classical CHAT80 NLP system for Prolog ☆25 · Feb 27, 2025 · Updated last year
- Information and artifacts for "LoRA Learns Less and Forgets Less" (TMLR, 2024) ☆21 · Sep 27, 2024 · Updated last year
- [ACL 2024] DiFiNet: Boundary-Aware Semantic Differentiation and Filtration Network for Nested Named Entity Recognition ☆17 · Oct 2, 2024 · Updated last year
- stock trading by Deep Q-Learning (Deep Q Network) ☆13 · Feb 6, 2017 · Updated 9 years ago
- Firecracker VM orchestration for Claude Code sessions ☆29 · Updated this week
- This is my code from competition Google Cloud & YouTube-8M Video Understanding Challenge. My solution based on video level features only. ☆16 · Jun 5, 2017 · Updated 8 years ago
- Simulator of a memory controller to connect DRAMSim and FlashDIMMSim into one unified memory ☆17 · Apr 4, 2024 · Updated 2 years ago
- Simple Streamlit application used for demonstrating Anthropic Claude 3 family of model's multimodal prompting on Amazon Bedrock ☆17 · Dec 5, 2024 · Updated last year
- Dataset from Tip of the Tongue Known-Item Retrieval (2021) paper. ☆12 · Nov 4, 2021 · Updated 4 years ago
- MCP server for OpenRouter.ai integration ☆62 · Nov 8, 2025 · Updated 6 months ago
- A set of tools to simplify development for JavaScript based SmartTVs ☆15 · May 11, 2026 · Updated last week
- Telegram bridge for Claude Code and Codex CLI ☆83 · Feb 26, 2026 · Updated 2 months ago
- an auto-sleeping and -waking framework around llama.cpp ☆12 · Feb 8, 2025 · Updated last year
- ☆12 · May 30, 2025 · Updated 11 months ago
- ☆19 · Sep 24, 2024 · Updated last year
- Automatically sync recurring Hebrew calendar events, like birthdays and anniversaries, to your digital calendar ☆25 · Updated this week
- An MCP server implementation providing a standardized interface for LLMs to interact with the Atla API. ☆18 · Jul 21, 2025 · Updated 10 months ago
- Use LLMs to clean your gmail inbox ☆22 · Dec 23, 2023 · Updated 2 years ago
- A standalone ip -> country lookup table. Prebuilt & Redistributable. ☆28 · Apr 29, 2025 · Updated last year
- Employees who do not use AI will be replaced by employees who do use AI. ☆27 · Dec 26, 2022 · Updated 3 years ago
- Official code for the NeurIPS 2025 Paper: C3Po: Cross-View Cross-Modality Correspondence by Pointmap Prediction ☆26 · Jan 27, 2026 · Updated 3 months ago
- Code for NIPS 2018 paper, "Chain of Reasoning for Visual Question Answering" ☆28 · Nov 23, 2018 · Updated 7 years ago
- This repository provides a framework to serve LLM(Large Language Model) based applications such as Chatbot. ☆18 · Apr 20, 2023 · Updated 3 years ago