OpenStellarTeam / ChineseSimpleQA

☆74

Alternatives and similar repositories for ChineseSimpleQA

Users who are interested in ChineseSimpleQA are comparing it to the repositories listed below.

PALIN2018 / BrowseComp-ZH
☆119 Updated 5 months ago
FlagOpen / Infinity-Instruct
☆49 Updated last year
thu-coai / CritiqueLLM
☆147 Updated last year
SuperGPQA / SuperGPQA
☆169 Updated 5 months ago
tianyi-lab / Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆180 Updated 4 months ago
PKU-Baichuan-MLSystemLab / PAS
☆54 Updated last year
SkyworkAI / Skywork-Reward-V2
Scaling Preference Data Curation via Human-AI Synergy
☆116 Updated 3 months ago
zexuanqiu / CLongEval
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models
☆44 Updated last year
a-m-team / a-m-models
a-m-team's exploration in large language modeling
☆189 Updated 4 months ago
OFA-Sys / InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
☆278 Updated 2 years ago
RUCAIBox / SimpleDeepSearcher
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
☆108 Updated 4 months ago
yuleiqin / fantastic-data-engineering
Fantastic Data Engineering for Large Language Models
☆91 Updated 9 months ago
X-PLUG / WritingBench
WritingBench: A Comprehensive Benchmark for Generative Writing
☆124 Updated last month
CASIA-LM / ChineseWebText
☆179 Updated last year
CASIA-LM / MoDS
☆145 Updated last year
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆135 Updated last year
nick7nlp / Counting-Stars
Counting-Stars (★)
☆83 Updated 4 months ago
thu-coai / ComplexBench
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
☆95 Updated 8 months ago
IronBeliever / CaR
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
☆89 Updated 11 months ago
RUC-GSAI / Llama-3-SynE
Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …
☆34 Updated 4 months ago
mtbench101 / mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
☆120 Updated last year
meowpass / FollowComplexInstruction
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆51 Updated last year
sail-sg / regmix
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
☆173 Updated 8 months ago
ADaM-BJTU / OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
☆152 Updated 10 months ago
TemporaryLoRA / Temp-LoRA
☆118 Updated last year
LaVi-Lab / CLEVA
[EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"
☆62 Updated 5 months ago
QwenLM / AutoIF
☆312 Updated last year
chenchen0103 / ACEBench
☆129 Updated this week
InternLM / POLAR
Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.
☆158 Updated last month
pldlgb / nuggets
☆83 Updated last year