Scrape and export data from the Open LLM Leaderboard.
☆48 · Dec 17, 2024 · Updated last year
Alternatives and similar repositories for scrape-open-llm-leaderboard
Users who are interested in scrape-open-llm-leaderboard are comparing it to the repositories listed below.
- The backend behind the LLM-Perf Leaderboard ☆11 · May 5, 2024 · Updated last year
- A simple generate script utils using fastchat conv template for generation of Large Language Models ☆21 · Jun 21, 2023 · Updated 2 years ago
- Cluster paraphrases by word sense ☆12 · Jan 3, 2019 · Updated 7 years ago
- Knowledge Graph based Question Answering benchmark. ☆10 · Feb 1, 2020 · Updated 6 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models. ☆19 · Feb 6, 2025 · Updated last year
- ☆17 · Dec 21, 2023 · Updated 2 years ago
- Open sourced backend for Martian's LLM Inference Provider Leaderboard ☆21 · Aug 13, 2024 · Updated last year
- The evaluation framework for the InfiCoder-Eval benchmark. ☆21 · Jul 22, 2024 · Updated last year
- Analysing differences between scheduled and actual start times for Mark McGowan press conferences ☆19 · Jul 8, 2021 · Updated 4 years ago
- Repository for the CODAH dataset ☆22 · Oct 29, 2022 · Updated 3 years ago
- Evaluating LLMs with fewer examples ☆169 · Apr 12, 2024 · Updated last year
- ☆65 · Aug 7, 2023 · Updated 2 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 · Sep 19, 2025 · Updated 5 months ago
- TyDiP Multilingual Politeness dataset and code ☆12 · Oct 15, 2023 · Updated 2 years ago
- Reasoning over Multiple Sentences (Multi-RC) ☆34 · May 20, 2020 · Updated 5 years ago
- ☆12 · Sep 22, 2015 · Updated 10 years ago
- NPR Visuals' fork of Quartz' Chartbuilder tool ☆23 · Jul 24, 2018 · Updated 7 years ago
- Local LLM Testing & Benchmarking for Apple Silicon ☆56 · Feb 26, 2026 · Updated last week
- Fragments-Expert is a software package for feature extraction from file fragments and classification among various file formats. ☆13 · Jan 16, 2024 · Updated 2 years ago
- [CVPR2024] Learning from Synthetic Human Group Activities ☆14 · Feb 24, 2025 · Updated last year
- ☆15 · Oct 24, 2023 · Updated 2 years ago
- ☆13 · Nov 5, 2024 · Updated last year
- A Swedish Natural Language Understanding Benchmark ☆11 · Dec 12, 2025 · Updated 2 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference … ☆14 · Dec 12, 2024 · Updated last year
- A framework for few-shot evaluation of autoregressive language models. ☆12 · Jul 14, 2025 · Updated 7 months ago
- Code for Massive-scale Decoding for Text Generation using Lattices ☆44 · Jul 29, 2022 · Updated 3 years ago
- Automatically evaluate your LLMs in Google Colab ☆687 · May 7, 2024 · Updated last year
- The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈 ☆16 · Updated this week
- ☆11 · Jan 3, 2024 · Updated 2 years ago
- LightRAG with Neo4j Example Project ☆17 · May 19, 2025 · Updated 9 months ago
- Collatinus Python Lemmatizer ☆10 · Jun 1, 2021 · Updated 4 years ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 · Jan 30, 2024 · Updated 2 years ago
- Data used in Climate Indicator Project figures and tables ☆15 · Jun 26, 2025 · Updated 8 months ago
- ☆12 · Nov 5, 2024 · Updated last year
- Utils to view, curate, pseudonymize, and anonymize DICOM tags and to copy DICOM files. ☆11 · Oct 15, 2025 · Updated 4 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆11 · Sep 23, 2024 · Updated last year
- OpenSource deployment made easy ☆10 · Jun 13, 2015 · Updated 10 years ago
- Shaping Language Models with Cognitive Insights ☆15 · Feb 29, 2024 · Updated 2 years ago
- ☆11 · Oct 11, 2023 · Updated 2 years ago