Open-LLM-Leaderboard: Open-Style Question Evaluation. Paper at https://arxiv.org/abs/2406.07545
☆50Jun 27, 2024Updated last year
Alternatives and similar repositories for Open-LLM-Leaderboard
Users who are interested in Open-LLM-Leaderboard are comparing it to the repositories listed below
- [ICLR 2026] Optimization-free Dataset Distillation for Object Detection. Paper at: https://arxiv.org/abs/2506.01942 ☆22Jan 26, 2026Updated last month
- This repository contains papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs).☆11May 24, 2024Updated last year
- [ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang☆16May 4, 2023Updated 2 years ago
- ☆52Jul 31, 2024Updated last year
- ☆27Jul 11, 2024Updated last year
- ☆15Nov 11, 2025Updated 3 months ago
- [AAAI 2024] SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research☆30Aug 6, 2024Updated last year
- TRAM: Benchmarking Temporal Reasoning for Large Language Models (Findings of ACL 2024)☆26Jun 21, 2024Updated last year
- AQUA dataset and VIKING model for the task of Art Visual Question Answering☆27Jun 4, 2021Updated 4 years ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations"☆61Nov 26, 2023Updated 2 years ago
- Solutions for the book "Speech and Language Processing" (3rd ed. draft) by Dan Jurafsky and James H. Martin☆29May 19, 2022Updated 3 years ago
- ☆32Apr 18, 2021Updated 4 years ago
- ☆61Jul 7, 2025Updated 7 months ago
- This is the implementation of LeCo☆31Jan 20, 2025Updated last year
- [ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives☆39Sep 9, 2025Updated 5 months ago
- ESP32 port of the existing TeslaBMS program☆10Jun 12, 2023Updated 2 years ago
- mReasoner is a unified computational implementation of the model theory of thinking and reasoning☆13Aug 17, 2023Updated 2 years ago
- Evaluation of neuro-symbolic engines☆41Aug 3, 2024Updated last year
- [ICLR 2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models☆42Oct 28, 2025Updated 4 months ago
- Evaluating LLMs with fewer examples☆169Apr 12, 2024Updated last year
- The official code of TACL 2021, "Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies".☆85Oct 31, 2022Updated 3 years ago
- Documentation sources for syslog-ng Open Source Edition (https://github.com/syslog-ng/syslog-ng)☆10May 6, 2024Updated last year
- ☆12Jan 11, 2026Updated last month
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs☆11Jan 13, 2023Updated 3 years ago
- Blazing fast, modular, next gen logagent☆11Updated this week
- ☆14Feb 5, 2025Updated last year
- ☆13Aug 28, 2024Updated last year
- Fake NEWS detector using LIAR dataset.☆11Aug 19, 2019Updated 6 years ago
- Code for experiments on self-prediction as a way to measure introspection in LLMs☆16Dec 10, 2024Updated last year
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- Code and data for the Walert large language model-based chatbot☆12Aug 14, 2025Updated 6 months ago
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models☆39Jul 19, 2024Updated last year
- A framework for few-shot evaluation of autoregressive language models.☆12Jul 14, 2025Updated 7 months ago
- A General Quantum Software☆17Updated this week
- [CVPR 2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated last year
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- Containerfile for the Vanilla OS Desktop+Nvidia image.☆16Feb 5, 2026Updated 3 weeks ago
- ☆44Nov 17, 2024Updated last year
- [ICML 2021] This is the official github repo for training L_inf dist nets with high certified accuracy.☆42Mar 16, 2022Updated 3 years ago