LaVi-Lab/CLEVA

LaVi-Lab / CLEVA

[EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"

☆64

Alternatives and similar repositories for CLEVA

Users who are interested in CLEVA are comparing it to the libraries listed below.

Felixgithub2017 / CG-Eval
Chinese Generation Evaluation
☆13 · Aug 14, 2023 · Updated 2 years ago
LaVi-Lab / FTTT
[ACL 2025] Official code for "Learning to Reason from Feedback at Test-Time".
☆13 · May 16, 2025 · Updated last year
llmeval / LLMEval-Fair
[ACL 2026] A large-scale longitudinal study on robust and fair evaluation of LLMs: 200K+ generative questions across 13 disciplines
☆40 · May 21, 2026 · Updated 2 months ago
marzenakrp / LiteraryTranslation
☆24 · Apr 2, 2024 · Updated 2 years ago
LaVi-Lab / EgoMask
[ICCV 2025] "Fine-grained Spatiotemporal Grounding on Egocentric Videos"
☆26 · Jul 3, 2026 · Updated 2 weeks ago
JinchaoLove / CUHK-PhD-Thesis-Template
LaTeX template for CUHK PhD Thesis
☆14 · Jun 29, 2025 · Updated last year
CLUEbenchmark / Math24o
Math24o: High School Olympiad Mathematics Chinese Benchmark
☆14 · Mar 27, 2025 · Updated last year
LaVi-Lab / Rethink_CoT_Video
Official code for "Rethinking Chain-of-Thought Reasoning for Videos"
☆21 · Dec 14, 2025 · Updated 7 months ago
Alpha-VLLM / WeMix-LLM
☆17 · Oct 15, 2023 · Updated 2 years ago
DAMO-NLP-SG / M3Exam
Data and code for paper "M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models"
☆105 · Jun 15, 2023 · Updated 3 years ago
SYSU-MUCFC-FinTech-Research-Center / ZhiLu
ZhiLu: a Chinese dialogue large language model for the consumer finance domain
☆30 · Nov 12, 2023 · Updated 2 years ago
stanford-crfm / helm
Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models …
☆2,857 · Jul 1, 2026 · Updated 2 weeks ago
NoSyu / VHUCM
Implementation of Variational Hierarchical User-based Conversation Model
☆10 · Jul 2, 2021 · Updated 5 years ago
AI45Lab / Flames
Flames is a highly adversarial Chinese benchmark for evaluating the harmlessness of LLMs, developed by Shanghai AI Lab and Fudan NLP Group.
☆68 · May 21, 2024 · Updated 2 years ago
LaVi-Lab / AIM
[ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"
☆65 · Oct 9, 2025 · Updated 9 months ago
Felixgithub2017 / MMCU
Measuring Massive Multitask Chinese Understanding
☆90 · Mar 24, 2024 · Updated 2 years ago
NJUDeepEngine / CAEF
Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"
☆11 · Oct 11, 2024 · Updated last year
krystalan / ClidSum
EMNLP 2022: ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization
☆37 · Jan 13, 2024 · Updated 2 years ago
maitrix-org / de-arena
Official repository for Decentralized Arena via Collective LLM Intelligence
☆18 · May 19, 2025 · Updated last year
qanastek / DrBERT
DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains
☆22 · Feb 7, 2024 · Updated 2 years ago
Kennethborup / centered_kernel_alignment
Implementation of Centered Kernel Alignment (CKA)
☆10 · Apr 7, 2021 · Updated 5 years ago
ant-research / M2-Miner
[ICLR 2026] M2-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
☆55 · Apr 22, 2026 · Updated 2 months ago
yueyang2000 / CKA_minibatch_pytorch
PyTorch implementation of Centered Kernel Alignment (CKA) and its minibatch version.
☆11 · May 11, 2022 · Updated 4 years ago
haonan-li / CMMLU
CMMLU: Measuring massive multitask language understanding in Chinese
☆828 · Dec 6, 2024 · Updated last year
gkoumasd / MSAF
Modality fusion approaches for sentiment analysis and emotion recognition tasks.
☆12 · Feb 5, 2021 · Updated 5 years ago
jlianglab / CAiD
Official PyTorch Implementation for CAiD: Context-Aware Instance Discrimination for Self-supervised Learning in Medical Imaging - MIDL 20…
☆11 · Apr 15, 2022 · Updated 4 years ago
OpenLMLab / GAOKAO-Bench-Updates
GAOKAO-Bench-Updates is a supplement to the GAOKAO-Bench, a dataset to evaluate large language models.
☆47 · Jan 7, 2025 · Updated last year
Alab-NII / Awesome-SciLM
Pre-trained Language Model for Scientific Text
☆46 · Feb 22, 2024 · Updated 2 years ago
AI-EDU-LAB / E-EVAL
Official GitHub repo for E-Eval, a Chinese K12 education evaluation benchmark for LLMs.
☆32 · Feb 19, 2024 · Updated 2 years ago
krystalan / RAGtrans
[EMNLP 2025 Findings] Retrieval-Augmented Machine Translation with Unstructured Knowledge
☆15 · Sep 4, 2025 · Updated 10 months ago
AlenUbuntu / Awesome-Vision-and-Language-PreTrain-Papers
☆14 · Dec 25, 2020 · Updated 5 years ago
kevinyaobytedance / llm_eval
LLM evaluation.
☆16 · Nov 7, 2023 · Updated 2 years ago
X-PLUG / CValues
Research on evaluating and aligning the values of Chinese large language models
☆560 · Jul 20, 2023 · Updated 3 years ago
hkust-nlp / ceval
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
☆1,860 · Jul 27, 2025 · Updated 11 months ago
leolle / atec_nlp
Ant Financial natural language processing competition.
☆10 · Sep 3, 2018 · Updated 7 years ago
SXU-YaxinGuo / CRMU
Commonsense Reasoning and Moral Understanding Evaluation in Children's Stories (CRMU)
☆18 · Oct 22, 2024 · Updated last year
flageval-baai / HalluDial
☆21 · Aug 19, 2024 · Updated last year
tjunlp-lab / Awesome-LLMs-Evaluation-Papers
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
☆804 · May 8, 2024 · Updated 2 years ago
seanzhang-zhichen / simcse-pytorch
SimCSE
☆15 · Oct 1, 2022 · Updated 3 years ago