Comprehensive LLM evaluation framework: GPQA Diamond to Chatbot Arena. Tests all major models equally, easily extensible.
☆17 · Aug 22, 2024 · Updated last year
Alternatives and similar repositories for BenchmarkAggregator
Users who are interested in BenchmarkAggregator are comparing it to the libraries listed below.
- Logical inference system based on event semantics and degree semantics in formal semantics ☆10 · Jan 22, 2023 · Updated 3 years ago
- GOPHI: an AMR-to-English Verbalizer ☆11 · Feb 5, 2020 · Updated 6 years ago
- ☆38 · Jan 25, 2026 · Updated 2 months ago
- a Haskell library that implements (Projective) Discourse Representation Theory (DRT) ☆27 · Sep 15, 2022 · Updated 3 years ago
- Python library providing a simple, fully supervised sentence embedding technique for textual adversarial attacks. ☆13 · Dec 13, 2023 · Updated 2 years ago
- Resources accompanying the "Zero-Shot Recommendation as Language Modeling" paper (ECIR2022) ☆14 · May 25, 2023 · Updated 2 years ago
- Data and related code for ACL2019 paper "Implicit Discourse Relation Identification for Open-domain Dialogues" ☆12 · Jul 29, 2019 · Updated 6 years ago
- Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020. ☆13 · Nov 17, 2020 · Updated 5 years ago
- [ECAI 2023] Official implementation of "FATRER: Full-Attention Topic Regularizer for Accurate and Robust Conversational Emotion Recogniti… ☆13 · Oct 9, 2023 · Updated 2 years ago
- Automated Semantic Analysis of Discourse Markers ☆11 · May 30, 2022 · Updated 3 years ago
- An open-source NLP library: fast text cleaning and preprocessing ☆23 · Nov 9, 2021 · Updated 4 years ago
- Allows two LLMs to communicate and run code in the terminal ☆28 · Dec 8, 2024 · Updated last year
- ☆13 · Jul 28, 2023 · Updated 2 years ago
- ☆18 · Feb 29, 2024 · Updated 2 years ago
- Collects a multimodal dataset of Wikipedia articles and their images ☆16 · Mar 25, 2023 · Updated 3 years ago
- Data creation, training and eval scripts for the IRCoder paper ☆21 · May 31, 2024 · Updated last year
- ☆19 · Dec 26, 2022 · Updated 3 years ago
- Code for ICLR 2019 paper 'CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model' ☆21 · May 21, 2019 · Updated 6 years ago
- Classical CHAT80 NLP system for Prolog ☆25 · Feb 27, 2025 · Updated last year
- ☆26 · Aug 2, 2025 · Updated 8 months ago
- ☆13 · Apr 6, 2025 · Updated last year
- Firecracker VM orchestration for Claude Code sessions ☆23 · Mar 30, 2026 · Updated last week
- Simple Streamlit application used for demonstrating Anthropic Claude 3 family of model's multimodal prompting on Amazon Bedrock ☆17 · Dec 5, 2024 · Updated last year
- Dataset from Tip of the Tongue Known-Item Retrieval (2021) paper. ☆12 · Nov 4, 2021 · Updated 4 years ago
- Create your own RVC v2 dataset from a youtube video ☆31 · Jan 27, 2024 · Updated 2 years ago
- Python Module for Logical Validation (forked from Rob Truxler library) ☆26 · Jul 28, 2020 · Updated 5 years ago
- ScribePal is an Open Source intelligent browser extension that leverages AI to empower your web experience by providing contextual insigh… ☆22 · Updated this week
- ☆18 · Feb 23, 2025 · Updated last year
- ☆12 · May 30, 2025 · Updated 10 months ago
- 🚧 Accepting Task Submissions 🚧 ☆108 · Updated this week
- an auto-sleeping and -waking framework around llama.cpp ☆12 · Feb 8, 2025 · Updated last year
- Use LLMs to clean your gmail inbox ☆21 · Dec 23, 2023 · Updated 2 years ago
- Tornado Oauth 2 client ☆17 · Dec 20, 2022 · Updated 3 years ago
- Roboflow's inference server to analyze video streams. This project extracts insights from video frames at defined intervals and generates… ☆12 · May 21, 2024 · Updated last year
- Official code for the NeurIPS 2025 Paper: C3Po: Cross-View Cross-Modality Correspondence by Pointmap Prediction ☆25 · Jan 27, 2026 · Updated 2 months ago
- Code for NIPS 2018 paper, "Chain of Reasoning for Visual Question Answering" ☆28 · Nov 23, 2018 · Updated 7 years ago
- This repository provides a framework to serve LLM (Large Language Model) based applications such as Chatbot. ☆18 · Apr 20, 2023 · Updated 2 years ago
- An LLM-enchanced Infocom Experience ☆22 · Apr 19, 2025 · Updated 11 months ago
- GHOSTS dataset ☆39 · Jul 19, 2023 · Updated 2 years ago