Weyaxi / scrape-open-llm-leaderboard
Scrape and export data from the Open LLM Leaderboard.
☆48Dec 17, 2024Updated last year
Alternatives and similar repositories for scrape-open-llm-leaderboard
Users interested in scrape-open-llm-leaderboard are comparing it to the repositories listed below
- The backend behind the LLM-Perf Leaderboard☆11May 5, 2024Updated last year
- Cluster paraphrases by word sense☆12Jan 3, 2019Updated 7 years ago
- Knowledge Graph based Question Answering benchmark.☆10Feb 1, 2020Updated 6 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.☆19Feb 6, 2025Updated last year
- Find informative examples to efficiently (human)-evaluate NLG models.☆18Feb 9, 2026Updated last week
- A small demo repo showing how I got neuralbeagle14-7b running locally on my 8GB GPU☆14Jan 29, 2024Updated 2 years ago
- N/A☆18Aug 15, 2022Updated 3 years ago
- ☆17Dec 21, 2023Updated 2 years ago
- Repository for the CODAH dataset☆22Oct 29, 2022Updated 3 years ago
- Evaluating LLMs with fewer examples☆169Apr 12, 2024Updated last year
- ☆32Jul 5, 2024Updated last year
- Logic grid puzzle ("zebra puzzle") generator and solver☆30Mar 1, 2024Updated last year
- ☆65Aug 7, 2023Updated 2 years ago
- Fact checking baseline combining dense retrieval and textual entailment☆30Aug 10, 2025Updated 6 months ago
- ☆27Mar 13, 2024Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Sep 19, 2025Updated 4 months ago
- TyDiP Multilingual Politeness dataset and code☆12Oct 15, 2023Updated 2 years ago
- Reasoning over Multiple Sentences (Multi-RC)☆34May 20, 2020Updated 5 years ago
- A template code for running modular and reproducible experiments in pytorch☆13Sep 3, 2025Updated 5 months ago
- ☆12Sep 22, 2015Updated 10 years ago
- A framework for few-shot evaluation of autoregressive language models.☆12Jul 14, 2025Updated 7 months ago
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- [CVPR2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated 11 months ago
- Fragments-Expert is a software package for feature extraction from file fragments and classification among various file formats.☆13Jan 16, 2024Updated 2 years ago
- Teaching a humanoid to walk(ish), then displaying in your browser (using tensorflow.js and reinforcement learning)☆10Sep 7, 2020Updated 5 years ago
- ☆17Updated this week
- COMET for African languages☆10Jan 24, 2025Updated last year
- A repository aimed at sharing links to climate-related resources.☆12Feb 4, 2026Updated 2 weeks ago
- Python wrapper for the energy system optimization framework IESopt.☆18Feb 9, 2026Updated last week
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- Code for Massive-scale Decoding for Text Generation using Lattices☆44Jul 29, 2022Updated 3 years ago
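- A Chinese financial large language model evaluation benchmark: six major categories, twenty-five tasks, graded evaluation; domestic models achieved an A grade☆10May 6, 2024Updated last year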
- lime-ner: extending LIME for Named Entity Recognition☆10Aug 15, 2018Updated 7 years ago
- QLoRA: Efficient Finetuning of Quantized LLMs☆11Jul 22, 2023Updated 2 years ago
- ☆18Jul 3, 2025Updated 7 months ago
- ☆11Oct 15, 2022Updated 3 years ago
- ☆11Jan 3, 2024Updated 2 years ago
- Repo collects Homework code for DSCI552/INF552 @USC 20Fall Semester.☆14Nov 27, 2020Updated 5 years ago
- Enhancing Sentence Embedding with Generalized Pooling☆11Jul 26, 2018Updated 7 years ago