Psycoy / MixEval

The official evaluation suite and dynamic data release for MixEval.

☆224

Related projects

Alternatives and complementary repositories for MixEval

allenai / WildBench
Benchmarking LLMs with Challenging Tasks from Real Users
☆195 Updated 2 weeks ago
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆236 Updated 4 months ago
arcee-ai / EvolKit
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…
☆180 Updated 3 weeks ago
WildEval / ZeroEval
A simple unified framework for evaluating LLMs
☆145 Updated last week
allenai / reward-bench
RewardBench: the first evaluation tool for reward models.
☆431 Updated 3 weeks ago
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆438 Updated 8 months ago
arcee-ai / DistillKit
An Open Source Toolkit For LLM Distillation
☆356 Updated 2 months ago
tianyi-lab / Reflection_Tuning
[ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
☆339 Updated 2 months ago
huggingface / cosmopedia
☆451 Updated 3 weeks ago
xfactlab / orpo
Official repository for ORPO
☆421 Updated 5 months ago
dwzhu-pku / LongEmbed
LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)
☆115 Updated last week
Kipok / NeMo-Skills
A pipeline to improve skills of large language models
☆191 Updated this week
da03 / Internalize_CoT_Step_by_Step
☆102 Updated last month
TIGER-AI-Lab / MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆124 Updated 3 weeks ago
microsoft / FILM
Official repo for "Make Your LLM Fully Utilize the Context"
☆242 Updated 6 months ago
deepseek-ai / ESFT
Expert Specialized Fine-Tuning
☆145 Updated last month
SALT-NLP / demonstrated-feedback
☆112 Updated last month
davanstrien / awesome-synthetic-datasets
awesome synthetic (text) datasets
☆242 Updated 3 weeks ago
architsharma97 / dpo-rlaif
☆90 Updated 4 months ago
OpenBMB / Eurus
☆287 Updated 2 months ago
Pints-AI / 1.5-Pints
A compact LLM pretrained in 9 days by using high quality data
☆262 Updated last month
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆105 Updated last month
writer / writing-in-the-margins
☆105 Updated 2 months ago
magpie-align / magpie
Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality s…
☆491 Updated 2 weeks ago
wuhy68 / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆129 Updated 2 months ago
microsoft / rho
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
☆307 Updated 7 months ago
lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆293 Updated 11 months ago
arcee-ai / PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
☆196 Updated 6 months ago
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024)
☆199 Updated 6 months ago
Liyan06 / MiniCheck
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]
☆103 Updated last month