open-compass/opencompass

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/open-compass/opencompass)

open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.

☆7,246

Alternatives and similar repositories for opencompass

Users that are interested in opencompass are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

EleutherAI / lm-evaluation-harness
View on GitHub
A framework for few-shot evaluation of language models.
☆13,452Jul 13, 2026Updated 2 weeks ago
InternLM / InternLM
View on GitHub
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
☆7,253Oct 30, 2025Updated 9 months ago
InternLM / lmdeploy
View on GitHub
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
☆7,982Updated this week
hiyouga / LlamaFactory
View on GitHub
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
☆73,625Updated this week
InternLM / xtuner
View on GitHub
A Next-Generation Training Engine Built for Ultra-Large MoE Models
☆5,170Updated this week
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
OpenRLHF / OpenRLHF
View on GitHub
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,864Jul 14, 2026Updated 2 weeks ago
verl-project / verl
View on GitHub
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,737Updated this week
modelscope / ms-swift
View on GitHub
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL…
☆15,009Updated this week
open-compass / VLMEvalKit
View on GitHub
Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
☆4,311Jul 22, 2026Updated last week
modelscope / evalscope
View on GitHub
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
☆3,166Updated this week
hkust-nlp / ceval
View on GitHub
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
☆1,863Jul 27, 2025Updated last year
Dao-AILab / flash-attention
View on GitHub
Fast and memory-efficient exact attention
☆24,580Updated this week
lm-sys / FastChat
View on GitHub
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,508May 1, 2026Updated 3 months ago
huggingface / trl
View on GitHub
Train transformer language models with reinforcement learning.
☆18,953Updated this week
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
FlagOpen / FlagEmbedding
View on GitHub
Retrieval and Retrieval-augmented LLMs
☆11,999Apr 22, 2026Updated 3 months ago
NVIDIA / Megatron-LM
View on GitHub
Ongoing research training transformer models at scale
☆17,265Updated this week
yangjianxin1 / Firefly
View on GitHub
Firefly: 大模型训练工具，支持训练Qwen2.5、Qwen2、Yi1.5、Phi-3、Llama3、Gemma、MiniCPM、Yi、Deepseek、Orion、Xverse、Mixtral-8x7B、Zephyr、Mistral、Baichuan2、Llma2、…
☆6,649Oct 24, 2024Updated last year
vllm-project / vllm
View on GitHub
A high-throughput and memory-efficient inference and serving engine for LLMs
☆87,645Updated this week
QwenLM / Qwen
View on GitHub
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
☆21,500Mar 5, 2026Updated 4 months ago
sgl-project / sglang
View on GitHub
SGLang is a high-performance serving framework for large language models and multimodal models.
☆30,970Updated this week
huggingface / peft
View on GitHub
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆21,460Updated this week
LianjiaTech / BELLE
View on GitHub
BELLE: Be Everyone's Large Language model Engine（开源中文对话大模型）
☆8,279Oct 16, 2024Updated last year
BradyFU / Awesome-Multimodal-Large-Language-Models
View on GitHub
Latest Advances on Multimodal Large Language Models
☆17,962Updated this week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
haonan-li / CMMLU
View on GitHub
CMMLU: Measuring massive multitask language understanding in Chinese
☆829Dec 6, 2024Updated last year
deepspeedai / DeepSpeed
View on GitHub
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆42,835Updated this week
QwenLM / Qwen-VL
View on GitHub
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
☆6,718Aug 7, 2024Updated last year
InternLM / lagent
View on GitHub
A lightweight framework for building LLM-based agents
☆2,273Updated this week
baichuan-inc / Baichuan2
View on GitHub
A series of large language models developed by Baichuan Intelligent Technology
☆4,088Nov 8, 2024Updated last year
haotian-liu / LLaVA
View on GitHub
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,956Aug 12, 2024Updated last year
QwenLM / Qwen3
View on GitHub
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
☆27,451Jan 9, 2026Updated 6 months ago
CLUEbenchmark / SuperCLUE
View on GitHub
SuperCLUE: 中文通用大模型综合性基准 | A Benchmark for Foundation Models in Chinese
☆3,296Feb 6, 2026Updated 5 months ago
deepspeedai / DeepSpeedExamples
View on GitHub
Example models using DeepSpeed
☆6,831Updated this week
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
huggingface / open-r1
View on GitHub
Fully open reproduction of DeepSeek-R1
☆26,412Apr 2, 2026Updated 3 months ago
ymcui / Chinese-LLaMA-Alpaca
View on GitHub
中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)
☆18,945Apr 19, 2026Updated 3 months ago
InternLM / InternLM-XComposer
View on GitHub
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
☆2,922May 26, 2025Updated last year
OpenGVLab / InternVL
View on GitHub
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. 接近GPT-4o表现的开源多模态对话模型
☆10,111Sep 22, 2025Updated 10 months ago
THUDM / LongBench
View on GitHub
LongBench v2 and LongBench (ACL 25'&24')
☆1,215Jan 15, 2025Updated last year
yizhongw / self-instruct
View on GitHub
Aligning pretrained language models with instruction data generated by themselves.
☆4,607Mar 27, 2023Updated 3 years ago
tatsu-lab / alpaca_eval
View on GitHub
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
☆2,008Aug 9, 2025Updated 11 months ago