awslabs / robustqa-acl23
☆15 · Updated 10 months ago
Alternatives and similar repositories for robustqa-acl23:
Users who are interested in robustqa-acl23 are comparing it to the repositories listed below.
- ☆37 · Updated 6 months ago
- ☆66 · Updated last year
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆23 · Updated 2 years ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆69 · Updated 2 months ago
- ☆14 · Updated 4 months ago
- Code and dataset for the EMNLP paper "Instruct and Extract: Instruction Tuning for On-Demand Information Extraction" ☆49 · Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 · Updated 2 weeks ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval ☆40 · Updated 3 months ago
- Official codebase for permutation self-consistency. ☆16 · Updated last year
- ☆19 · Updated 3 months ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc… ☆31 · Updated 8 months ago
- ☆27 · Updated 3 months ago
- ☆17 · Updated 6 months ago
- Leveraging passage embeddings for efficient listwise reranking with large language models. ☆36 · Updated 2 months ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆42 · Updated 7 months ago
- Embedding Recycling for Language Models ☆38 · Updated last year
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ☆30 · Updated last year
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆51 · Updated 2 months ago
- Retrieval-Augmented Generation battle! ☆49 · Updated 2 months ago
- ☆14 · Updated 8 months ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024) ☆42 · Updated 3 weeks ago
- Code, datasets, and models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆54 · Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆74 · Updated 4 months ago
- Code and data for the paper "Context-faithful Prompting for Large Language Models". ☆39 · Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆45 · Updated last year
- This repository contains the ToolSelect dataset, which was used to fine-tune Llama-2 70B for tool selection. ☆20 · Updated 11 months ago
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification ☆38 · Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆43 · Updated last year
- [ACL 2023] Few-shot Reranking for Multi-hop QA via Language Model Prompting ☆27 · Updated last year