ayulockin / llm-eval-sweep
A simple repository that showcases a few LLM evaluation strategies and leverages W&B Sweeps to optimize the LLM system.
☆12Updated 2 years ago
Alternatives and similar repositories for llm-eval-sweep
Users that are interested in llm-eval-sweep are comparing it to the libraries listed below
- ☆56Updated 2 years ago
- ☆20Updated 5 months ago
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆13Updated last year
- Clinical NLP Shared Task @ NAACL'24☆35Updated last month
- ☆54Updated 8 months ago
- ☆77Updated 8 months ago
- Implementation for EACL 2024 paper "Corpus-Steered Query Expansion with Large Language Models"☆12Updated last year
- Codebase accompanying the Summary of a Haystack paper.☆79Updated last year
- ☆48Updated last year
- ☆19Updated last year
- ☆13Updated 8 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆97Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆75Updated 11 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆78Updated 11 months ago
- Large-language Model Evaluation framework with Elo Leaderboard and A-B testing☆52Updated 11 months ago
- [NeurIPS'22] EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records☆90Updated 10 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆42Updated last year
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.☆42Updated 6 months ago
- Official repository of "EHR-SeqSQL : A Sequential Text-to-SQL Dataset For Interactively Exploring Electronic Health Records" (ACL 2024 Fi…☆16Updated last year
- Code for paper 'Data-Efficient FineTuning'☆28Updated 2 years ago
- ReBase: Training Task Experts through Retrieval Based Distillation☆29Updated 7 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Updated last year
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆55Updated 11 months ago
- ☆127Updated 11 months ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents☆24Updated 3 years ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆117Updated this week
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆34Updated 5 months ago
- ☆20Updated last year
- Open Implementations of LLM Analyses☆107Updated 11 months ago
- LangCode - Improving alignment and reasoning of large language models (LLMs) with natural language embedded program (NLEP).☆43Updated 2 years ago