EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.

☆9,588

Alternatives and similar repositories for lm-evaluation-harness

Users who are interested in lm-evaluation-harness compare it to the libraries listed below.

bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆7,330 · Updated this week
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,053 · Updated last week
AutoGPTQ / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆4,897 · Updated 3 months ago
pytorch / torchtune
PyTorch native post-training library
☆5,347 · Updated this week
mit-han-lab / llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
☆3,154 · Updated this week
InternLM / lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
☆6,736 · Updated this week
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,271 · Updated last week
open-compass / opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, …
☆5,699 · Updated this week
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆18,448 · Updated last week
stanford-crfm / helm
Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models …
☆2,344 · Updated this week
huggingface / trl
Train transformer language models with reinforcement learning.
☆14,675 · Updated this week
casper-hansen / AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
☆2,210 · Updated 2 months ago
ModelTC / LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili…
☆3,398 · Updated this week
sgl-project / sglang
SGLang is a fast serving framework for large language models and vision language models.
☆16,236 · Updated this week
FasterDecoding / Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
☆2,575 · Updated last year
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Asy…
☆7,407 · Updated this week
EleutherAI / pythia
The hub for EleutherAI's work on interpretability and learning dynamics
☆2,570 · Updated last month
allenai / open-instruct
AllenAI's post-training codebase
☆3,067 · Updated this week
allenai / OLMo
Modeling, training, eval, and inference code for OLMo
☆5,810 · Updated last week
yizhongw / self-instruct
Aligning pretrained language models with instruction data generated by themselves.
☆4,429 · Updated 2 years ago
huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆19,080 · Updated last week
jzhang38 / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆8,654 · Updated last year
openai / transformer-debugger
☆4,087 · Updated last year
volcengine / verl
verl: Volcano Engine Reinforcement Learning for LLMs
☆11,168 · Updated this week
microsoft / LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
☆4,067 · Updated 3 weeks ago
pytorch-labs / gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
☆6,033 · Updated 3 months ago
Zjh-819 / LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
☆3,182 · Updated last year
NVIDIA / TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati…
☆11,063 · Updated this week
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,352 · Updated this week
tatsu-lab / alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
☆1,806 · Updated 6 months ago