songys/huggingface_KoreanDataset

songys / huggingface_KoreanDataset

Korean datasets available on Hugging Face

☆37

Alternatives and similar repositories for huggingface_KoreanDataset

Users who are interested in huggingface_KoreanDataset are comparing it to the repositories listed below.

HeegyuKim / open-korean-instructions
A collection of open Korean instruction datasets for training language models.
☆469 · Apr 13, 2025 · Updated last year
wandb / llm-kr-eval
☆20 · Jul 24, 2024 · Updated 2 years ago
human-rights-corpus / HRC
#인권코퍼스 (Human Rights Corpus)
☆31 · Oct 6, 2023 · Updated 2 years ago
nlpai-lab / KURE
KURE: an embedding model specialized for Korean retrieval, developed at Korea University
☆225 · Apr 14, 2026 · Updated 3 months ago
instructkr / LogicKor
A multi-domain reasoning benchmark for Korean language models
☆209 · Oct 17, 2024 · Updated last year
paust-team / pko-t5
BPE-based Korean T5 model for a unified text-to-text framework
☆63 · Apr 17, 2024 · Updated 2 years ago
J-Seo / KoCommonGEN-V2
KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models
☆25 · Aug 24, 2024 · Updated last year
MrBananaHuman / UnethicalQuestionsKor
☆19 · Oct 24, 2023 · Updated 2 years ago
overfit-brothers / KRX-2024
☆12 · Dec 20, 2024 · Updated last year
teddysum / korean_evaluation
☆10 · Jun 5, 2025 · Updated last year
Atipico1 / Kor-IR
Kor-IR: Korean Information Retrieval Benchmark
☆87 · Jul 3, 2024 · Updated 2 years ago
davidkim205 / kollm_evaluation
Evaluation of Korean models using in-house Korean evaluation datasets
☆31 · May 31, 2024 · Updated 2 years ago
Marker-Inc-Korea / AutoRAG-example-korean-embedding-benchmark
AutoRAG example about benchmarking Korean embeddings.
☆46 · Oct 2, 2024 · Updated last year
deepseasw / nlp_model_list
An introduction to recent natural language processing models
☆74 · Jul 22, 2022 · Updated 4 years ago
metterian / korean_bert_score
BERT score for text generation
☆12 · Jan 15, 2025 · Updated last year
rladmstn1714 / CLIcK
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
☆48 · Dec 23, 2024 · Updated last year
MrBananaHuman / Voice2Facemesh
☆11 · Aug 9, 2022 · Updated 3 years ago
EleutherAI / hae-rae
☆33 · Aug 30, 2023 · Updated 2 years ago
LG-NLP / KorWikiTableQuestions
This repo is for Korean wiki table question answering datasets described in the paper of Korean-Specific Dataset for Table Question Answe…
☆91 · Oct 22, 2024 · Updated last year
kakao / OrchestrationBench
☆48 · Apr 17, 2026 · Updated 3 months ago
workdd / LLM_Foreign_Block
Code that prevents LLMs from generating foreign-language tokens
☆87 · Aug 7, 2025 · Updated 11 months ago
hephaex / mecab-ko
MeCab-Ko: a Korean morphological analyzer implemented in Rust. Compatible with the Sejong corpus.
☆19 · Jul 19, 2026 · Updated last week
BM-K / KoDiffCSE
Difference-based Contrastive Learning for Korean Sentence Embeddings
☆23 · Mar 11, 2026 · Updated 4 months ago
instructkr / reranker-simple-benchmark
Make running benchmarks simple yet maintainable again. Currently supports only Korean cross-encoders.
☆35 · Dec 2, 2025 · Updated 7 months ago
jooinjang / Ko-ATOMIC
Korean Commonsense Knowledge Graph
☆15 · Dec 23, 2022 · Updated 3 years ago
prometheus-eval / scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
☆12 · Mar 25, 2025 · Updated last year
kh-kim / nlp-express-practice
☆10 · Jan 20, 2024 · Updated 2 years ago
ahans30 / goldfish-loss
[NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs
☆98 · Nov 17, 2024 · Updated last year
boychaboy / KOLD
KOLD: Korean Offensive Language Dataset
☆83 · Nov 13, 2022 · Updated 3 years ago
Beomi / Gemma-EasyLM
Train Gemma on TPU/GPU! (Codebase for training the Gemma-Ko series)
☆50 · Mar 2, 2024 · Updated 2 years ago
HAE-RAE / haerae-evaluation-toolkit
The most modern LLM evaluation toolkit
☆70 · Apr 30, 2026 · Updated 2 months ago
gauss5930 / iDUS
An unofficial implementation of the SOLAR-10.7B model and of the newly proposed interlocked-DUS (iDUS), with experiment details.
☆14 · Mar 20, 2024 · Updated 2 years ago
MrBananaHuman / CounselGPT
Korean psychological counseling dataset
☆81 · Jun 20, 2023 · Updated 3 years ago
Marker-Inc-Korea / Korean-SAT-LLM-Leaderboard
Korean SAT leaderboard
☆169 · Nov 20, 2025 · Updated 8 months ago
LG-AI-EXAONE / KMMLU-Pro
☆16 · Aug 18, 2025 · Updated 11 months ago
su-park / mteb_ko_leaderboard
Leaderboard for Korean text embedding models
☆97 · Oct 22, 2024 · Updated last year
openkorpos / model-mecab
MeCab model trained with OpenKorPos.
☆23 · Jun 19, 2022 · Updated 4 years ago
JoJo0217 / rlhf_korean_dataset
For a Korean RLHF training environment
☆25 · Sep 25, 2023 · Updated 2 years ago
lcw99 / evolve-instruct
Evolve LLM training instructions, from English instructions to any language.
☆120 · Sep 15, 2023 · Updated 2 years ago