qcri / LLMeBench
Benchmarking Large Language Models
☆80 Updated last month
Related projects
Alternatives and complementary repositories for LLMeBench
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆122 Updated 8 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆97 Updated 7 months ago
- ☆37 Updated 4 months ago
- The codebase for our ACL2023 paper: Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learni… ☆27 Updated last year
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆58 Updated 2 years ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆37 Updated 8 months ago
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆54 Updated 6 months ago
- The LM Contamination Index is a manually created database of contamination evidence for LMs. ☆76 Updated 7 months ago
- Resources for cultural NLP research ☆67 Updated this week
- A Multilingual Replicable Instruction-Following Model ☆94 Updated last year
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆40 Updated 3 years ago
- Code and dataset for the EMNLP paper titled Instruct and Extract: Instruction Tuning for On-Demand Information Extraction ☆50 Updated 10 months ago
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ☆30 Updated last year
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling. ☆53 Updated 3 months ago
- LogiTorch is a PyTorch-based library for logical reasoning on natural language ☆68 Updated 2 months ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆104 Updated 2 years ago
- The data and the PyTorch implementation for the models and experiments in the paper "Exploiting Asymmetry for Synthetic Training Data Gen… ☆58 Updated last year
- Token-level Reference-free Hallucination Detection ☆93 Updated last year
- A framework for few-shot evaluation of autoregressive language models. ☆102 Updated last year
- ☆95 Updated last year
- ☆38 Updated 7 months ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆53 Updated last year
- This project studies the performance and robustness of language models and task-adaptation methods. ☆141 Updated 6 months ago
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings. ☆70 Updated 3 months ago
- Tools for managing datasets for governance and training. ☆78 Updated 3 weeks ago
- Apps built using Inspired Cognition's Critique. ☆58 Updated last year
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆91 Updated last year
- ☆19 Updated 3 years ago
- ☆30 Updated 2 years ago
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆66 Updated 8 months ago