Stability-AI/lm-evaluation-harness

Stability-AI / lm-evaluation-harness

A framework for few-shot evaluation of autoregressive language models.

☆153

Alternatives and similar repositories for lm-evaluation-harness

Users who are interested in lm-evaluation-harness are comparing it to the libraries listed below.

yahoojapan / JGLUE
JGLUE: Japanese General Language Understanding Evaluation
☆346 · Mar 31, 2025 · Updated last year
nobu-g / JGLUE-evaluation-scripts
Training and evaluation scripts for JGLUE, a Japanese language understanding benchmark
☆18 · Updated this week
verypluming / JSICK
Repository for JSICK
☆46 · May 31, 2023 · Updated 3 years ago
HojiChar / HojiChar
The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.
☆128 · Updated this week
wandb / llm-leaderboard
A project for evaluating LLMs on Japanese tasks
☆94 · Jul 15, 2026 · Updated last week
ku-nlp / ja-vicuna-qa-benchmark
☆33 · Jul 31, 2024 · Updated last year
megagonlabs / instruction_ja
Japanese instruction data
☆24 · Jul 13, 2023 · Updated 3 years ago
shi3z / alpaca_ja
A Japanese translation of the Alpaca dataset
☆86 · Jun 3, 2023 · Updated 3 years ago
llm-jp / llm-jp-sft
☆62 · Jun 13, 2024 · Updated 2 years ago
llm-jp / llm-jp-eval
☆164 · Updated this week
DaisukeBekki / JSeM
Japanese semantic test suite (FraCaS counterpart and extensions)
☆13 · Apr 21, 2026 · Updated 3 months ago
lighttransport / japanese-llama-experiment
Japanese LLaMa experiment
☆54 · Dec 27, 2025 · Updated 6 months ago
turingmotors / vlm-recipes
☆20 · Aug 28, 2024 · Updated last year
Stability-AI / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆52 · Jul 5, 2024 · Updated 2 years ago
hppRC / simple-simcse-ja
Exploring Japanese SimCSE
☆69 · Oct 31, 2023 · Updated 2 years ago
kunishou / oasst1-89k-ja
☆16 · Nov 19, 2023 · Updated 2 years ago
kunishou / do-not-answer-ja
☆24 · Dec 15, 2023 · Updated 2 years ago
llm-jp / awesome-japanese-llm
Overview of Japanese LLMs
☆1,421 · Jun 28, 2026 · Updated 3 weeks ago
nu-dialogue / jmultiwoz
JMultiWOZ: A Large-Scale Japanese Multi-Domain Task-Oriented Dialogue Dataset, LREC-COLING 2024
☆25 · Mar 27, 2024 · Updated 2 years ago
osekilab / JCoLA
☆19 · Apr 21, 2026 · Updated 3 months ago
inspection-ai / japanese-toxic-dataset
☆22 · Jan 11, 2023 · Updated 3 years ago
taishi-i / awesome-japanese-nlp-resources
A curated list of resources for Japanese natural language processing (NLP): Python libraries, LLMs, dictionaries, corpora, and datasets. …
☆995 · Updated this week
turingmotors / heron
Heron is a library that seamlessly integrates multiple Vision and Language models, as well as Video and Language models.
☆177 · Jun 13, 2024 · Updated 2 years ago
kunishou / databricks-dolly-15k-ja
☆89 · Jul 25, 2023 · Updated 2 years ago
hiroshi-matsuda-rit / NLP2024-tutorial-3
NLP2024 Tutorial 3: Practicing how to build a Japanese large-scale language model - E…
☆113 · Apr 2, 2024 · Updated 2 years ago
llm-jp / llm-jp-corpus
☆47 · Feb 2, 2024 · Updated 2 years ago
AUGMXNT / shisa
☆43 · Mar 30, 2024 · Updated 2 years ago
yuzu-ai / japanese-llm-ranking
☆50 · Apr 10, 2024 · Updated 2 years ago
iwiwi / epochraft-hf-fsdp
Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP
☆11 · Jan 29, 2024 · Updated 2 years ago
sbintuitions / JMTEB
The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark)
☆93 · Mar 16, 2026 · Updated 4 months ago
hppRC / llm-translator
Mixtral-based Ja-En (En-Ja) Translation model
☆20 · Jan 6, 2025 · Updated last year
masa3141 / japanese-alpaca-lora
A Japanese fine-tuned instruction LLaMA
☆128 · Mar 20, 2023 · Updated 3 years ago
aiishii / JEMHopQA
☆30 · Apr 10, 2025 · Updated last year
SkelterLabsInc / JaQuAD
JaQuAD: Japanese Question Answering Dataset for Machine Reading Comprehension (2022, Skelter Labs)
☆110 · Mar 2, 2022 · Updated 4 years ago
hppRC / llm-lora-classification
Text classification using LLMs and LoRA
☆98 · Jul 22, 2023 · Updated 3 years ago
WorksApplications / SudachiTra
Japanese tokenizer for Transformers
☆80 · Dec 15, 2023 · Updated 2 years ago
singletongue / wikipedia-utils
Utility scripts for preprocessing Wikipedia texts for NLP
☆78 · Apr 9, 2024 · Updated 2 years ago
webbigdata-jp / JTransBench
A tool to easily benchmark Japanese translation skills
☆13 · Oct 11, 2025 · Updated 9 months ago
pfnet-research / pfgen-bench
Preferred Generation Benchmark
☆102 · Mar 6, 2026 · Updated 4 months ago