NoMIRACL: A multilingual hallucination evaluation dataset to evaluate LLM robustness in RAG against first-stage retrieval errors on 18 languages.
☆27Nov 29, 2024Updated last year
Alternatives and similar repositories for nomiracl
Users who are interested in nomiracl are comparing it to the libraries listed below.
- Crosslingual Reasoning through Test-Time Scaling☆20May 13, 2025Updated last year
- The paper list of multilingual pre-trained models (Continually Updated).☆24Jun 18, 2024Updated last year
- ☆28Oct 31, 2023Updated 2 years ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings☆15May 3, 2023Updated 3 years ago
- A repo for LLM jailbreak☆14Sep 5, 2023Updated 2 years ago
- Japanese Massive Multitask Language Understanding Benchmark☆40Oct 7, 2025Updated 8 months ago
- ACL 2023 (Findings) End-to-end Cross-lingual Label Project☆15Nov 24, 2023Updated 2 years ago
- ☆12Mar 1, 2025Updated last year
- ☆20Mar 22, 2024Updated 2 years ago
- AJIMEE-Bench (Advanced Japanese IME Evaluation Benchmark)☆20Jan 13, 2025Updated last year
- Evaluation tools shared across anserini, pyserini, and pygaggle☆36May 16, 2026Updated 3 weeks ago
- Lightblue LLM Eval Framework: tengu, elyza100, ja-mtbench, rakuda☆18Apr 29, 2026Updated last month
- EANN(Pytorch)☆10Mar 12, 2022Updated 4 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.☆82Feb 16, 2022Updated 4 years ago
- CAMeL Dataset☆15Apr 15, 2025Updated last year
- ☆15Oct 24, 2022Updated 3 years ago
- STREET: a multi-task and multi-step reasoning dataset☆26Feb 28, 2024Updated 2 years ago
- Tools for working with the S800 corpus☆12Sep 17, 2020Updated 5 years ago
- Sustain-LC is a benchmarking environment for traditional and reinforcement learning based controls as well as LLM based control☆35Aug 7, 2025Updated 10 months ago
- The Python Implementation of CRISP: Clustering Multi-Vector Representations for Denoising and Pruning☆27Jul 27, 2025Updated 10 months ago
- ☆21May 5, 2017Updated 9 years ago
- This repository helps you evaluate your models on the FreshStack benchmark!☆34Dec 9, 2025Updated 6 months ago
- Weibo rumor detection, with a Vue frontend and a Flask backend☆13Jul 14, 2020Updated 5 years ago
- ☆14Sep 1, 2025Updated 9 months ago
- Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability"☆15Jun 13, 2023Updated 3 years ago
- Portuguese Language Corpus and Models☆25Oct 2, 2017Updated 8 years ago
- ☆13Oct 4, 2022Updated 3 years ago
- Source code for SIGIR 2022 paper.☆16Apr 25, 2022Updated 4 years ago
- TopViewRS: Vision-Language Models as Top-View Spatial Reasoners (EMNLP 2024 Oral)☆15Jun 14, 2025Updated last year
- Code for "Towards Robust k-Nearest-Neighbor Machine Translation" (EMNLP 2022)☆12Oct 18, 2022Updated 3 years ago
- Dataset for Conversation Semantic Role Labeling☆11Aug 26, 2021Updated 4 years ago
- ☆15May 12, 2025Updated last year
- ☆369May 17, 2024Updated 2 years ago
- python package for unsupervised text segmentation.☆14Oct 31, 2016Updated 9 years ago
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he…☆28Aug 8, 2025Updated 10 months ago
- NeurIPS 2024: RAGraph: A General Retrieval-Augmented Graph Learning Framework☆23Feb 4, 2025Updated last year
- [NAACL 2025 Main Conference] PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization☆27Mar 29, 2025Updated last year
- Ultimate playbook for unmoderated UX testing☆13Jan 27, 2025Updated last year
- Evaluation of BEIR Datasets using ColBERT retrieval model☆18Mar 4, 2022Updated 4 years ago