dmahan93 / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆16 · Updated last year
Alternatives and similar repositories for lm-evaluation-harness:
Users who are interested in lm-evaluation-harness are comparing it to the libraries listed below.
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated 9 months ago
- ☆20 · Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 · Updated 6 months ago
- ☆24 · Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆47 · Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆16 · Updated last year
- ☆48 · Updated last year
- ☆48 · Updated 5 months ago
- ☆20 · Updated last year
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆168 · Updated last year
- ☆43 · Updated 2 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 · Updated last year
- Tools for formatting large language model prompts. ☆13 · Updated last year
- ☆32 · Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT ☆27 · Updated last year
- ☆32 · Updated last year
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆22 · Updated last month
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆39 · Updated last year
- Conduct consumer interviews with synthetic focus groups using LLMs and LangChain ☆43 · Updated last year
- Set of scripts to finetune LLMs ☆37 · Updated last year
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 · Updated 8 months ago
- ☆66 · Updated 11 months ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 · Updated 7 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 · Updated 6 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆76 · Updated 6 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆27 · Updated last year
- ☆1 · Updated 9 months ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆37 · Updated 11 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆34 · Updated last year