NoMIRACL: A multilingual hallucination evaluation dataset for measuring LLM robustness in RAG against first-stage retrieval errors across 18 languages.
☆26Nov 29, 2024Updated last year
Alternatives and similar repositories for nomiracl
Users who are interested in nomiracl are comparing it to the libraries listed below
- AJIMEE-Bench (Advanced Japanese IME Evaluation Benchmark)☆18Jan 13, 2025Updated last year
- Lightblue LLM Eval Framework: tengu, elyza100, ja-mtbench, rakuda☆18Jan 6, 2026Updated 2 months ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings☆15May 3, 2023Updated 2 years ago
- Japanese Massive Multitask Language Understanding Benchmark☆38Oct 7, 2025Updated 4 months ago
- Crosslingual Reasoning through Test-Time Scaling☆19May 13, 2025Updated 9 months ago
- ☆17May 31, 2023Updated 2 years ago
- Latest version of MedEX/J (Japanese disease name extractor)☆18May 17, 2022Updated 3 years ago
- ☆20Mar 22, 2024Updated last year
- A paper list of multilingual pre-trained models (continually updated).☆24Jun 18, 2024Updated last year
- Swallow project: large language model evaluation scripts☆24Sep 17, 2025Updated 5 months ago
- ☆30Jun 3, 2024Updated last year
- Benchmark for Japanese document embedding & vector search☆29Mar 12, 2024Updated last year
- https://pypi.org/project/intent-suggestions/☆10Sep 6, 2022Updated 3 years ago
- Code and data for "The Power of Noise: Redefining Retrieval for RAG Systems"☆69Jul 3, 2025Updated 8 months ago
- Evaluation tools shared across anserini, pyserini, and pygaggle☆35Feb 26, 2026Updated last week
- Official code for the AAAI2023 paper `Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum`☆45Feb 9, 2025Updated last year
- COMET-ATOMIC ja☆31Mar 8, 2024Updated last year
- NLP 100 Exercise 2025☆40Apr 9, 2025Updated 10 months ago
- A lightweight framework for evaluating visual-language models.☆41Jan 16, 2026Updated last month
- Logical inference system based on event semantics and degree semantics in formal semantics☆11Jan 22, 2023Updated 3 years ago
- JQaRA: Japanese Question Answering with Retrieval Augmentation - a Japanese Q&A dataset for evaluating retrieval augmentation (RAG)☆43Sep 9, 2025Updated 5 months ago
- A more Python-like take on WaPEN's syntax☆14Updated this week
- ☆11Jun 11, 2024Updated last year
- EANN (PyTorch)☆10Mar 12, 2022Updated 3 years ago
- FactNews is the first dataset to predict sentence-level factuality of news reporting. Furthermore, we provide baseline results for sentenc…☆11Jun 12, 2025Updated 8 months ago
- Directed masked autoencoders☆14Feb 20, 2026Updated 2 weeks ago
- web programming course (COMPSCI 326, UMass Amherst)☆14Sep 13, 2022Updated 3 years ago
- The official implementation of the paper "Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset" (ICASSP 2…☆12Feb 19, 2023Updated 3 years ago
- Published materials for the JSAI2019 tutorial lecture "Semantic Technologies Based on Ontology Engineering"☆12Jun 7, 2019Updated 6 years ago
- Text generator trained on the works of João Guimarães Rosa☆11Jul 14, 2021Updated 4 years ago
- Stochastic Kronecker Generation in Python, Used in RPI TRUST☆10Dec 13, 2017Updated 8 years ago
- ☆12Oct 4, 2022Updated 3 years ago
- Tokyo Metropolitan University Paraphrase Corpus (TMUP)☆11Jun 12, 2017Updated 8 years ago
- A simple chatbot sample on chatbase☆11May 18, 2020Updated 5 years ago
- ☆11Nov 27, 2022Updated 3 years ago
- Check and warn if a Pull Request will conflict with another Pull Request when they get merged.☆10Feb 26, 2026Updated last week
- An unconventional programming language that compiles to EVM bytecode.☆15Feb 25, 2026Updated last week
- m3 techbook template☆10May 2, 2023Updated 2 years ago
- Custom JS and CSS made by hideo54.☆10Sep 22, 2022Updated 3 years ago