microsoft / Multilingual-Evaluation-of-Generative-AI-MEGA
Code for the Multilingual Evaluation of Generative AI (MEGA) paper, published at EMNLP 2023
☆62 · Updated 6 months ago
Related projects:
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆118 · Updated 6 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages (ACL 2023) ☆96 · Updated 5 months ago
- A Multilingual Replicable Instruction-Following Model ☆91 · Updated last year
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆89 · Updated last year
- ☆65 · Updated last year
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆85 · Updated last month
- Multilingual Large Language Models Evaluation Benchmark ☆91 · Updated 3 weeks ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆77 · Updated last month
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆69 · Updated 6 months ago
- Token-level Reference-free Hallucination Detection ☆92 · Updated last year
- ☆73 · Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset. ☆91 · Updated last year
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings. ☆68 · Updated last month
- What's In My Big Data (WIMBD): a toolkit for analyzing large text datasets ☆174 · Updated last week
- Mr. TyDi is a multilingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆71 · Updated 2 years ago
- Detect hallucinated tokens for conditional sequence generation. ☆63 · Updated 2 years ago
- Tools for managing datasets for governance and training. ☆77 · Updated last month
- ☆97 · Updated 2 years ago
- ☆94 · Updated last year
- ☆160 · Updated last year
- Benchmarking library for RAG ☆87 · Updated this week
- ☆37 · Updated last year
- Do Multilingual Language Models Think Better in English? ☆41 · Updated last year
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆102 · Updated last year
- Dense hybrid representations for text retrieval ☆60 · Updated last year
- Code for Zero-Shot Tokenizer Transfer ☆109 · Updated 2 months ago
- ☆56 · Updated 7 months ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆63 · Updated last week
- PyTorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks ☆62 · Updated 2 years ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning ☆143 · Updated 2 months ago