h2oai / h2o-LLM-eval
Large-language Model Evaluation framework with Elo Leaderboard and A-B testing
☆52 Updated 8 months ago
Alternatives and similar repositories for h2o-LLM-eval
Users who are interested in h2o-LLM-eval are comparing it to the repositories listed below.
- A set of utilities for running few-shot prompting experiments on large-language models ☆121 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆130 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 8 months ago
- We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in … ☆54 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆78 Updated 9 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆77 Updated 8 months ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆56 Updated last year
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆66 Updated 11 months ago
- Evaluating tool-augmented LLMs in conversation settings ☆85 Updated last year
- ☆16 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 11 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆104 Updated 6 months ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆84 Updated 10 months ago
- [NeurIPS 2023] PyTorch code for Can Language Models Teach? Teacher Explanations Improve Student Performance via Theory of Mind ☆67 Updated last year
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI ☆94 Updated 2 years ago
- ☆57 Updated 9 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆149 Updated last year
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆162 Updated last year
- ☆39 Updated 11 months ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 Updated 9 months ago
- Reward Model framework for LLM RLHF ☆61 Updated 2 years ago
- ☆52 Updated last year
- ☆94 Updated 6 months ago
- SILO Language Models code repository ☆81 Updated last year
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆53 Updated 7 months ago
- ☆39 Updated 2 years ago
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L… ☆43 Updated 2 years ago
- ☆84 Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆108 Updated last year
- Code accompanying "How I learned to start worrying about prompt formatting". ☆105 Updated 2 weeks ago