EleutherAI / lm-evaluation-harness
A framework for few-shot evaluation of language models.
☆6,990 Updated this week
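A minimal sketch of running the harness from its Python API, assuming a recent v0.4-style release; the checkpoint and task names are illustrative, and exact entry points and result keys may differ between versions.

```python
# Sketch: zero-shot HellaSwag evaluation via the lm-evaluation-harness Python API.
# Model checkpoint and task choice are assumptions for illustration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                     # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",  # any HF causal LM
    tasks=["hellaswag"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["hellaswag"])  # per-task metrics
```

The same run is typically also available through the `lm_eval` command-line entry point with equivalent flags.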
Related projects
Alternatives and complementary repositories for lm-evaluation-harness
- Accessible large language models via k-bit quantization for PyTorch. ☆6,299 Updated this week
- Tools for merging pretrained large language models. ☆4,816 Updated 2 weeks ago
- Train transformer language models with reinforcement learning. ☆10,086 Updated this week
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆4,497 Updated last month
- SGLang is a fast serving framework for large language models and vision language models. ☆6,127 Updated this week
- Large Language Model Text Generation Inference ☆9,122 Updated this week
- General technology for enabling AI capabilities with LLMs and MLLMs ☆3,699 Updated last month
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,059 Updated 5 months ago
- Fast and memory-efficient exact attention ☆14,279 Updated this week
- Aligning pretrained language models with instruction data generated by the models themselves. ☆4,164 Updated last year
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning (see the LoRA sketch after this list). ☆16,471 Updated this week
- Ongoing research training transformer models at scale ☆10,595 Updated this week
- Go ahead and axolotl questions ☆7,930 Updated this week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,502 Updated 10 months ago
- PyTorch native finetuning library ☆4,336 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs (see the generation sketch after this list) ☆30,423 Updated this week
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆2,526 Updated last month
- Robust recipes to align language models with human and AI preferences ☆4,680 Updated last month
- Transformer-related optimization, including BERT, GPT ☆5,890 Updated 7 months ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Ad… ☆5,994 Updated 2 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆2,205 Updated this week
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,282 Updated 2 weeks ago
- Retrieval and Retrieval-augmented LLMs ☆7,613 Updated this week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆4,669 Updated this week
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆7,919 Updated 6 months ago
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ☆10,776 Updated 3 months ago
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. ☆1,765 Updated this week
- Supercharge Your LLM Application Evaluations 🚀 ☆7,261 Updated this week
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ☆1,904 Updated this week
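For the parameter-efficient fine-tuning entry above, a minimal sketch of wrapping a causal LM with LoRA adapters via 🤗 PEFT; the base model and target modules are assumptions for illustration, not a prescribed recipe.

```python
# Sketch: attach LoRA adapters to a small causal LM with PEFT.
# Base model and target_modules are illustrative (GPT-2 uses "c_attn").
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    r=8,                        # adapter rank
    lora_alpha=16,
    target_modules=["c_attn"],  # attention projection to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA weights remain trainable
```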
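For the high-throughput inference and serving entry, a minimal sketch of offline batched generation with vLLM, assuming a small Hugging Face checkpoint for illustration; server deployment uses a separate entry point not shown here.

```python
# Sketch: offline batched text generation with vLLM.
# The checkpoint is an assumption; any supported HF model works.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```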