J-Seo/KoCommonGEN-V2

KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models

☆25

Alternatives and similar repositories for KoCommonGEN-V2

Users who are interested in KoCommonGEN-V2 are comparing it to the repositories listed below.

KU-HIAI / Ko-Gemma
☆34 · Feb 27, 2024 · Updated 2 years ago
dlawjddn803 / INFO
Code for the paper "You Truly Understand What I Need : Intellectual and Friendly Dialogue Agents grounding Knowledge and Persona" which i…
☆23 · Apr 6, 2023 · Updated 3 years ago
J-Seo / Korean-CommonGen
[Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
☆28 · Dec 9, 2022 · Updated 3 years ago
J-Seo / K-HALU
K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models
☆37 · Dec 30, 2025 · Updated 6 months ago
J-Seo / KommonGen
The KommonGen dataset for commonsense reasoning in Korean generative models.
☆17 · Oct 5, 2021 · Updated 4 years ago
metterian / peep-talk
A Situational Conversation-Based English Education Platform
☆22 · Jan 16, 2026 · Updated 6 months ago
metterian / korean_bert_score
BERT score for text generation
☆12 · Jan 15, 2025 · Updated last year
sugyeonge / Towards-diverse-QAG
☆19 · Mar 4, 2024 · Updated 2 years ago
js-lee-AI / awesome-llm-agent-papers
A curated, continuously updated reading list of 200+ papers on LLM agents: planning, memory, tool use, multi-agent, evaluation & safety. …
☆56 · Updated this week
yjoonjang / reviewsearch-skill
Draft grounded rebuttals to your paper's reviews, with the experiments actually run in your workspace
☆17 · Jul 22, 2026 · Updated last week
rgop13 / GRASP
Source code for the "GRASP: Guiding model with RelAtional Semantics using Prompt"
☆17 · Sep 7, 2023 · Updated 2 years ago
yjoonjang / PreRanker
PreRanker: reranking tools before tool-use
☆20 · Apr 9, 2025 · Updated last year
HAE-RAE / haerae-evaluation-toolkit
The most modern LLM evaluation toolkit
☆70 · Apr 30, 2026 · Updated 2 months ago
daekeun-ml / evaluate-llm-on-korean-dataset
Performs benchmarking on two Korean datasets with minimal time and effort.
☆45 · Jan 22, 2026 · Updated 6 months ago
UpstageAI / evalverse
The Universe of Evaluation. All about the evaluation for LLMs.
☆235 · Jul 9, 2024 · Updated 2 years ago
gauss5930 / iDUS
An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.
☆14 · Mar 20, 2024 · Updated 2 years ago
nlpai-lab / Korean-CommonGen
[Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
☆11 · May 27, 2022 · Updated 4 years ago
instructkr / LogicKor
A multi-domain reasoning benchmark for Korean language models
☆209 · Oct 17, 2024 · Updated last year
nlpai-lab / KommonGen
The KommonGen dataset for commonsense reasoning in Korean generative models.
☆21 · Oct 5, 2021 · Updated 4 years ago
rladmstn1714 / CLIcK
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
☆48 · Dec 23, 2024 · Updated last year
jooinjang / Ko-ATOMIC
Korean Commonsense Knowledge Graph
☆15 · Dec 23, 2022 · Updated 3 years ago
songys / huggingface_KoreanDataset
Korean datasets available on Hugging Face
☆37 · Oct 10, 2024 · Updated last year
Atipico1 / Kor-IR
Kor-IR: Korean Information Retrieval Benchmark
☆87 · Jul 3, 2024 · Updated 2 years ago
nlpai-lab / KULLM
☁️ 구름 (KULLM): an LLM specialized for Korean, developed at Korea University
☆588 · May 1, 2024 · Updated 2 years ago
sonsuhyune / UPEval
☆15 · Jul 14, 2026 · Updated 2 weeks ago
davidkim205 / kollm_evaluation
Korean model evaluation using an in-house Korean evaluation dataset
☆31 · May 31, 2024 · Updated 2 years ago
wltschmrz / DGMO
☆39 · Aug 20, 2025 · Updated 11 months ago
nlpai-lab / KURE
KURE: an embedding model specialized for Korean retrieval, developed at Korea University
☆225 · Apr 14, 2026 · Updated 3 months ago
LG-AI-EXAONE / KoMT-Bench
Official repository for KoMT-Bench built by LG AI Research
☆73 · Aug 8, 2024 · Updated last year
kakao / FunctionChat-Bench
☆120 · Feb 25, 2026 · Updated 5 months ago
Marker-Inc-Korea / AutoRAG-example-korean-embedding-benchmark
AutoRAG example about benchmarking Korean embeddings.
☆46 · Oct 2, 2024 · Updated last year
sb-jang / kodialogbench
Code and data for "KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark" (LREC-COLING…
☆18 · Apr 15, 2025 · Updated last year
jaromirsalamon / Awesome-Dialogue-System-Papers
💬 A curated list of incredible amount of publications related to Dialogue Systems especially Chatbots and Chit-chat Systems
☆10 · Dec 5, 2019 · Updated 6 years ago
overfit-brothers / KRX-2024
☆12 · Dec 20, 2024 · Updated last year
UpstageAI / dataverse
The Universe of Data. All about data, data science, and data engineering
☆563 · Jul 18, 2024 · Updated 2 years ago
corca-ai / evaluating-gpt-4o-on-CLIcK
Evaluate gpt-4o on CLIcK (Korean NLP Dataset)
☆20 · May 18, 2024 · Updated 2 years ago
nlpai-lab / MIRAGE
MIRAGE is a light benchmark to evaluate RAG performance.
☆37 · May 18, 2025 · Updated last year
KisuYang / EmotionX-KU
BERT-Max based Contextual Emotion Classifier
☆35 · Oct 12, 2025 · Updated 9 months ago
HeegyuKim / open-korean-instructions
A collection of public Korean instruction datasets for training language models.
☆469 · Apr 13, 2025 · Updated last year