dmahan93/lm-evaluation-harness

dmahan93 / lm-evaluation-harness

A framework for few-shot evaluation of autoregressive language models.

☆16

Alternatives and similar repositories for lm-evaluation-harness

Users who are interested in lm-evaluation-harness are comparing it to the repositories listed below.

mlabonne / tinytuner
🐜🔧 A minimalistic tool to fine-tune your LLMs
☆19 · Aug 17, 2023 · Updated 2 years ago
SebastianBodza / EnsembleForecasting
Using multiple LLMs for ensemble Forecasting
☆16 · Jan 17, 2024 · Updated 2 years ago
gauss5930 / iDUS
An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.
☆14 · Mar 20, 2024 · Updated 2 years ago
samchaineau / llm_slerp_generation
Repo hosting codes and materials related to speeding LLMs' inference using token merging.
☆37 · Oct 9, 2025 · Updated 9 months ago
fblgit / model-similarity
Simple Model Similarities Analysis
☆21 · Feb 3, 2024 · Updated 2 years ago
cloneofsimo / fim-llama-deepspeed
☆33 · Jan 1, 2024 · Updated 2 years ago
ctlllll / understanding_llm_benchmarks
Understanding the correlation between different LLM benchmarks
☆30 · Jan 11, 2024 · Updated 2 years ago
DEFI-COLaF / LADaS
Layout Analysis Dataset with Segmonto (LADaS)
☆25 · May 29, 2026 · Updated last month
vibrantlabsai / Funtuner
Supervised instruction finetuning for LLM with HF trainer and Deepspeed
☆37 · Jul 6, 2023 · Updated 3 years ago
Nicolas-Yax / PhyloLM
Genetics for Language Models
☆18 · Jul 1, 2024 · Updated 2 years ago
Pleias / Various-Finetuning
Set of scripts to finetune LLMs
☆38 · Mar 30, 2024 · Updated 2 years ago
johnrobinsn / redpajama
Training and Inference Notebooks for the RedPajama (OpenLlama) models
☆19 · May 18, 2023 · Updated 3 years ago
jwjohns / LFM2Sloth
Modular task agnostic training pipeline using LFM2 from Liquid AI with unsloth.
☆16 · Sep 13, 2025 · Updated 10 months ago
xaiguy / chippy
☆13 · Feb 26, 2023 · Updated 3 years ago
gpt4life / alpagasus
Unofficial implementation of AlpaGasus
☆94 · Sep 23, 2023 · Updated 2 years ago
cstorm125 / esninja
Best practices for product search in English and Thai using Elasticsearch
☆14 · Mar 16, 2021 · Updated 5 years ago
mikolajbadyl / flutter_leap_sdk
A Flutter plugin for integrating Liquid AI's LEAP SDK, enabling on-device deployment of small language models in Flutter applications.
☆24 · Sep 3, 2025 · Updated 10 months ago
apple / ml-kg-mt
☆23 · Nov 8, 2024 · Updated last year
yxtay / python-project-template
Starter template for python projects
☆18 · Feb 15, 2024 · Updated 2 years ago
emilyng-sz / cursor-motion-analysis
end-to-end solution for cursor detection and motion analysis
☆13 · Jan 22, 2025 · Updated last year
opensearch-project / opensearch-learning-to-rank-base
Fork of https://github.com/o19s/elasticsearch-learning-to-rank to work with OpenSearch
☆21 · Updated this week
Pleias / marginalia
☆67 · Mar 4, 2024 · Updated 2 years ago
xamat / blog
Personal blog post set up using jekyll
☆16 · Updated this week
Upaya07 / NeurIPS-llm-efficiency-challenge
Code for NeurIPS LLM Efficiency Challenge
☆62 · Apr 9, 2024 · Updated 2 years ago
Alignment-Lab-AI / datagen
a pipeline for using api calls to agnostically convert unstructured data into structured training data
☆32 · Sep 22, 2024 · Updated last year
henrikalbihn / gliner-as-a-service
GLiNER model in a FastAPI microservice.
☆47 · Dec 11, 2024 · Updated last year
gauss5930 / AlpaGasus2-QLoRA
This is AlpaGasus2-QLoRA based on LLaMA2 with AlpaGasus mechanism using QLoRA!
☆15 · Nov 22, 2023 · Updated 2 years ago
ApGa / Go-Browse
Automatic, unsupervised collection of web agent training data via exploration.
☆29 · Oct 8, 2025 · Updated 9 months ago
choosewhatulike / case2code
☆17 · Apr 7, 2025 · Updated last year
kookaburracodes / investor-education-chatchain
Not financial advice.
☆28 · Mar 18, 2023 · Updated 3 years ago
guardrails / guardrails
☆19 · Sep 19, 2017 · Updated 8 years ago
Alignment-Lab-AI / Dataset-Conversion-Toolkit
a set of scripts to easily convert all training data from huggingface into alpaca instruct or sharegpt format, which should allow for eas…
☆20 · Mar 14, 2025 · Updated last year
AblateIt / finetune-study
Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.
☆83 · Sep 10, 2023 · Updated 2 years ago
microsoft / Evoke
Everything for the Paper: 'Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing'
☆19 · Dec 2, 2023 · Updated 2 years ago
emrgnt-cmplxty / zero-shot-replication
☆75 · Sep 5, 2023 · Updated 2 years ago
kyutai-labs / moshi-webrtc
Proof of concept for running moshi/hibiki using webrtc
☆21 · Feb 28, 2025 · Updated last year
acl-org / acl-2023
Repository for the ACL 2023 conference website
☆11 · Jan 9, 2024 · Updated 2 years ago
TREMA-UNH / rubric-grading-workbench
A Workbench for Autograding Retrieve/Generate Systems
☆15 · Jun 30, 2025 · Updated last year
openrewardstandard / python-sdk
A Python SDK for Open Reward Standard servers and clients
☆17 · Mar 24, 2026 · Updated 4 months ago