VinAIResearch / WhoQA
Who's Who: Large Language Models Meet Knowledge Conflicts in Practice (EMNLP 2024 Findings)
☆10 Updated 5 months ago
Alternatives and similar repositories for WhoQA
Users who are interested in WhoQA are comparing it to the libraries listed below
- Multilingual Large Language Models Evaluation Benchmark ☆127 Updated 10 months ago
- ☆75 Updated 6 months ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆97 Updated last year
- Evaluation of the Cross-Lingual Knowledge Alignment in LLMs ☆9 Updated last year
- ☆182 Updated 2 weeks ago
- Crosslingual Reasoning through Test-Time Scaling ☆18 Updated 2 months ago
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq ☆29 Updated 2 years ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆150 Updated 2 months ago
- contrastive decoding ☆202 Updated 2 years ago
- Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation ☆35 Updated last year
- ACL 2023: Evaluating Open-Domain Question Answering in the Era of Large Language Models ☆47 Updated last year
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆203 Updated last year
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he… ☆25 Updated 4 months ago
- [ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia ☆169 Updated 11 months ago
- NAACL 2024: SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning ☆25 Updated 4 months ago
- The evaluation code for the paper "MoreHopQA: More Than Multi-hop Reasoning" ☆14 Updated last year
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆114 Updated 10 months ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆70 Updated last year
- ☆242 Updated last year
- ☆13 Updated last year
- ☆55 Updated 10 months ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆21 Updated 2 years ago
- EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration ☆36 Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆59 Updated last year
- ☆49 Updated 4 months ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ☆363 Updated 3 months ago
- VNHSGE: Vietnamese High School Graduation Examination Dataset for Large Language Models ☆27 Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆162 Updated 3 weeks ago
- PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an… ☆275 Updated 2 years ago
- [ACL 2023] Learning Multi-step Reasoning by Solving Arithmetic Tasks. https://arxiv.org/abs/2306.01707 ☆24 Updated 2 years ago