jungyeul/korean-parallel-corpora

Korean Parallel Corpus

☆147

Alternatives and similar repositories for korean-parallel-corpora

Users interested in korean-parallel-corpora are comparing it to the repositories listed below.

EleutherAI / hae-rae
☆33 · Aug 30, 2023 · Updated 2 years ago
hanjanghoon / NLP_Koeran_DP
Grand prize winner (Minister of Culture, Sports and Tourism Award) for Korean dependency parsing at the 2019 Korean Language Competition
☆16 · Oct 26, 2022 · Updated 3 years ago
jeongukjae / namuwiki-corpus
Namuwiki dataset segmented into sentences. Download it from Releases or through tfds-korean.
☆19 · Jun 16, 2021 · Updated 5 years ago
MrBananaHuman / KoGPT2ForParaphrasing
TEMP
☆34 · Apr 2, 2020 · Updated 6 years ago
kakaobrain / kortok
The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020)
☆119 · Oct 8, 2020 · Updated 5 years ago
songys / Question_pair
#Paired Question
☆24 · Jun 16, 2020 · Updated 6 years ago
haven-jeon / ko_en_neural_machine_translation
Korean English NMT (Neural Machine Translation) with Gluon
☆61 · Feb 28, 2018 · Updated 8 years ago
naver / nlp-challenge
NLP Shared tasks (NER, SRL) using NSML
☆184 · Jan 3, 2019 · Updated 7 years ago
kmounlp / NER
Technical report on standardizing Korean named-entity definitions and labels, and a named-entity morpheme corpus built on it
☆94 · Jan 25, 2021 · Updated 5 years ago
emorynlp / ud-korean
Universal Dependency Treebanks in Korean
☆39 · Dec 19, 2021 · Updated 4 years ago
Huffon / nlp-startups
List of Korean startups researching and developing natural language processing technology
☆163 · May 10, 2020 · Updated 6 years ago
kakaobrain / kor-nlu-datasets
KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding
☆314 · Jul 9, 2023 · Updated 3 years ago
cynthia / kosentences
Large scale unannotated Korean corpus for unsupervised tasks. (e.g. Language modeling)
☆28 · Aug 11, 2019 · Updated 6 years ago
UniversalDependencies / UD_Korean-Kaist
Data from KAIST (a Korean treebank).
☆19 · May 6, 2026 · Updated 2 months ago
SKT-AI / KoBART
Korean BART
☆468 · Jun 14, 2025 · Updated last year
coolengineer / sejong-corpus
Korean sejong corpus download and simple analysis
☆151 · May 9, 2019 · Updated 7 years ago
kocohub / korean-hate-speech
Korean HateSpeech Dataset
☆398 · Jul 18, 2020 · Updated 6 years ago
Beomi / KcBERT
🤗 Pretrained BERT model & WordPiece tokenizer trained on Korean comments, plus the dataset
☆494 · Nov 7, 2022 · Updated 3 years ago
kaniblu / hangul-utils
An integrated library for Korean language preprocessing.
☆205 · Apr 23, 2023 · Updated 3 years ago
monologg / KoBERT-Transformers
KoBERT on 🤗 Huggingface Transformers 🤗 (with Bug Fixed)
☆211 · Aug 21, 2024 · Updated last year
ZIZUN / korean-malicious-comments-dataset
Korean malicious comments dataset
☆73 · Sep 26, 2020 · Updated 5 years ago
monologg / ko_lm_dataformat
A utility for storing and reading files for Korean LM training 💾
☆35 · Jul 18, 2026 · Updated last week
warnikchow / paraKQC
Parallel dataset of Korean Questions and Commands
☆60 · Mar 24, 2023 · Updated 3 years ago
ko-nlp / Korpora
Korean corpus repository
☆757 · Oct 3, 2022 · Updated 3 years ago
lovit / namuwikitext
Wikitext format dataset of Namuwiki (Most famous Korean wikipedia)
☆53 · Oct 25, 2020 · Updated 5 years ago
KLUE-benchmark / KLUE
📖 Korean NLU Benchmark
☆602 · Jun 30, 2026 · Updated 3 weeks ago
seopbo / nlp_classification
Implementing nlp papers relevant to classification with PyTorch, gluonnlp
☆229 · Dec 8, 2022 · Updated 3 years ago
smilegate-ai / HuLiC
☆93 · Mar 3, 2022 · Updated 4 years ago
e9t / nsmc
Naver sentiment movie corpus
☆603 · Mar 7, 2017 · Updated 9 years ago
Kyubyong / KoParadigm
KoParadigm: Korean Inflectional Paradigm Generator
☆60 · Nov 23, 2022 · Updated 3 years ago
likejazz / korean-sentence-splitter
Split Korean text into sentences using a heuristic algorithm.
☆216 · Dec 24, 2020 · Updated 5 years ago
monologg / KoELECTRA
Pretrained ELECTRA Model for Korean
☆637 · Feb 19, 2024 · Updated 2 years ago
lovit / kowikitext
☆19 · Jan 17, 2021 · Updated 5 years ago
lovit / KoBERTScore
BERTScore for Korean
☆81 · Feb 22, 2024 · Updated 2 years ago
bab2min / corpus
A personally collected set of corpora for Korean NLP
☆140 · Sep 15, 2020 · Updated 5 years ago
j-min / korean-parallel-corpora
Korean Parallel Corpus
☆11 · Nov 27, 2014 · Updated 11 years ago
theeluwin / sci-news-sum-kr-50
A dataset of 50 articles selected from the IT/science section of Naver News, with the sentences that make up the summary tagged.
☆40 · Nov 23, 2016 · Updated 9 years ago
monologg / DistilKoBERT
Distillation of KoBERT from SKTBrain (Lightweight KoBERT)
☆200 · Sep 6, 2023 · Updated 2 years ago
jeongukjae / korean-wikipedia-corpus
Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it through tfds-korean.
☆24 · Sep 6, 2023 · Updated 2 years ago