dmahan93 / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆16 · Updated last year
Alternatives and similar repositories for lm-evaluation-harness:
Users interested in lm-evaluation-harness are comparing it to the libraries listed below:
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 · Updated 6 months ago
- ☆24 · Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 · Updated last year
- ☆48 · Updated last year
- The Next Generation Multi-Modality Superintelligence ☆70 · Updated 4 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆47 · Updated 10 months ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆29 · Updated 4 months ago
- Routing on Random Forest (RoRF) ☆100 · Updated 4 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆66 · Updated 3 months ago
- auto fine tune of models with synthetic data ☆74 · Updated 11 months ago
- ☆30 · Updated last year
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆34 · Updated last year
- ☆48 · Updated 2 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 · Updated last year
- KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci… ☆24 · Updated last year
- Set of scripts to finetune LLMs ☆36 · Updated 10 months ago
- Conduct consumer interviews with synthetic focus groups using LLMs and LangChain ☆43 · Updated last year
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆38 · Updated 10 months ago
- ☆57 · Updated last year
- ☆19 · Updated 5 months ago
- Simple examples using Argilla tools to build AI ☆52 · Updated 2 months ago
- ☆37 · Updated last year
- Evaluating LLMs with CommonGen-Lite ☆88 · Updated 10 months ago
- Tools for formatting large language model prompts. ☆12 · Updated last year
- Track the progress of LLM context utilisation ☆53 · Updated 6 months ago
- Build a Streamlit Chatbot using Langchain, ColBERT, Ragatouille, and ChromaDB ☆118 · Updated last year
- ☆22 · Updated last year
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 · Updated last year
- ☆65 · Updated 8 months ago
- Official homepage for "Self-Harmonized Chain of Thought" ☆89 · Updated last week