nlp-waseda / JMMLU
Japanese Massive Multitask Language Understanding Benchmark
☆38 Oct 7, 2025 Updated 4 months ago
Alternatives and similar repositories for JMMLU
Users who are interested in JMMLU are comparing it to the repositories listed below.
- ☆147 Feb 7, 2026 Updated last week
- Swallow project: large language model evaluation scripts ☆23 Sep 17, 2025 Updated 4 months ago
- ☆50 Apr 10, 2024 Updated last year
- ☆62 Jun 13, 2024 Updated last year
- ☆33 Jul 31, 2024 Updated last year
- Preferred Generation Benchmark ☆91 Oct 28, 2025 Updated 3 months ago
- Japanese LLaMa experiment ☆54 Dec 27, 2025 Updated last month
- ☆19 May 23, 2024 Updated last year
- ☆41 Apr 10, 2025 Updated 10 months ago
- JMED-LLM: Japanese Medical Evaluation Dataset for Large Language Models ☆56 Sep 22, 2024 Updated last year
- Japanese translation of Open Source AI Definition ☆26 Nov 15, 2024 Updated last year
- JMultiWOZ: A Large-Scale Japanese Multi-Domain Task-Oriented Dialogue Dataset, LREC-COLING 2024 ☆25 Mar 27, 2024 Updated last year
- Benchmark for Japanese document embedding & vector search ☆29 Mar 12, 2024 Updated last year
- ☆16 Mar 4, 2024 Updated last year
- ☆24 Dec 15, 2023 Updated 2 years ago
- Creates a list of the latest LLMs ☆20 Feb 1, 2026 Updated 2 weeks ago
- Project of llm evaluation to Japanese tasks ☆91 Feb 4, 2026 Updated last week
- ☆29 Apr 10, 2025 Updated 10 months ago
- COMET-ATOMIC ja ☆31 Mar 8, 2024 Updated last year
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆14 Jul 30, 2025 Updated 6 months ago
- Annotated Fuman Kaitori Center Corpus ☆18 Dec 18, 2023 Updated 2 years ago
- Japanese / English Bilingual LLM ☆28 Dec 23, 2025 Updated last month
- A library for evaluation of Grammatical Error Correction (GEC). Accepted to ACL'25 Demo: "gec-metrics: A Unified Library for Grammatical … ☆14 Jan 25, 2026 Updated 3 weeks ago
- Evaluation Pipeline for medical tasks. ☆12 Updated this week
- RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities ☆63 Mar 13, 2024 Updated last year
- ☆15 Nov 20, 2025 Updated 2 months ago
- AJIMEE-Bench (Advanced Japanese IME Evaluation Benchmark) ☆18 Jan 13, 2025 Updated last year
- n-wise coverage tool for combinatorial testing ☆11 Sep 7, 2019 Updated 6 years ago
- AI Assistance for Writing Scientific Alt Text ☆14 Feb 7, 2024 Updated 2 years ago
- NoMIRACL: A multilingual hallucination evaluation dataset to evaluate LLM robustness in RAG against first-stage retrieval errors on 18 la… ☆26 Nov 29, 2024 Updated last year
- ☆45 Sep 6, 2025 Updated 5 months ago
- Exploring Japanese SimCSE ☆69 Oct 31, 2023 Updated 2 years ago
- Repository for JSICK ☆45 May 31, 2023 Updated 2 years ago
- ☆43 Feb 2, 2024 Updated 2 years ago
- The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark) ☆84 Jan 6, 2026 Updated last month
- 2023 ABCI Llama-2 continual learning project ☆14 Jan 22, 2024 Updated 2 years ago
- ☆11 Oct 2, 2024 Updated last year
- Utility scripts for preprocessing Wikipedia texts for NLP ☆78 Apr 9, 2024 Updated last year
- Python-based chat demo for TinySwallow-1.5B that works completely offline ☆57 Jan 29, 2025 Updated last year