ServiceNow / drbench
An enterprise deep research benchmark
☆32 Updated 3 months ago
Alternatives and similar repositories for drbench
Users who are interested in drbench are comparing it to the repositories listed below
- ☆80 Updated last year
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆86 Updated last year
- Repository for paper Decrypting Cryptic Crosswords ☆10 Updated 4 years ago
- Inquisitive Parrots for Search ☆199 Updated 8 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆136 Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 Updated 2 years ago
- Codebase accompanying the Summary of a Haystack paper. ☆80 Updated last year
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆192 Updated 6 months ago
- ☆29 Updated 2 years ago
- Ensembling Hugging Face transformers made easy ☆61 Updated 3 years ago
- State-of-the-art paired encoder and decoder models (17M-1B params) ☆58 Updated 6 months ago
- ☆38 Updated 6 months ago
- Code repository for the NAACL 2022 paper "ExSum: From Local Explanations to Model Understanding" ☆64 Updated 3 years ago
- ☆57 Updated 2 years ago
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets ☆218 Updated 2 years ago
- ☆43 Updated last year
- Scalable training for dense retrieval models. ☆298 Updated 7 months ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆118 Updated 3 years ago
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆176 Updated 2 weeks ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆96 Updated 2 years ago
- MetaQA: Combining Expert Agents for Multi-Skill Question Answering ☆23 Updated 3 years ago
- A diff tool for language models ☆44 Updated 2 years ago
- Ranking of fine-tuned HF models as base models. ☆36 Updated 4 months ago
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆36 Updated 8 months ago
- Fine-tune ModernBERT with custom tokenizers, curriculum learning, and next-gen optimizers. ☆74 Updated 3 weeks ago
- ☆13 Updated 2 years ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆56 Updated 2 years ago
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆62 Updated 3 years ago
- ☆29 Updated 11 months ago
- The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models. ☆182 Updated 3 years ago