jeongukjae / korean-wikipedia-corpus
A Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it via tfds-korean.
☆24 · Sep 6, 2023 · Updated 2 years ago
Alternatives and similar repositories for korean-wikipedia-corpus
Users who are interested in korean-wikipedia-corpus are comparing it to the repositories listed below.
- ☆11 · Oct 3, 2021 · Updated 4 years ago
- Korean Relation Extraction Gold Standard · ☆35 · May 31, 2021 · Updated 4 years ago
- ⛩ All about Korean Transformers (information and tutorial) · ☆19 · Jun 21, 2022 · Updated 3 years ago
- A project for Korean automatic spacing · ☆12 · Aug 3, 2020 · Updated 5 years ago
- Grand prize (Minister of Culture, Sports and Tourism Award) for Korean dependency parsing at the 2019 Korean language competition · ☆15 · Oct 26, 2022 · Updated 3 years ago
- Character-level (syllable-unit) Korean ELECTRA Model · ☆54 · Jun 12, 2023 · Updated 2 years ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation · ☆11 · May 27, 2022 · Updated 3 years ago
- ☆33 · Aug 30, 2023 · Updated 2 years ago
- I hope this list will have a positive influence on Korean online services. · ☆63 · Feb 10, 2019 · Updated 7 years ago
- Meets every Thursday at 20:00 · ☆16 · Jul 24, 2020 · Updated 5 years ago
- KoGPT2 on Huggingface Transformers · ☆33 · May 4, 2021 · Updated 4 years ago
- A utility for storing and reading files for Korean LM training 💾 · ☆35 · Oct 15, 2025 · Updated 4 months ago
- Wikitext-format dataset of Namuwiki (the most famous Korean wiki) · ☆53 · Oct 25, 2020 · Updated 5 years ago
- BERTScore for Korean · ☆80 · Feb 22, 2024 · Updated last year
- Korean morphological analyzer · ☆28 · Dec 22, 2019 · Updated 6 years ago
- ☆25 · Oct 28, 2020 · Updated 5 years ago
- Utils for cleaning Sejong corpus data · ☆37 · Sep 30, 2019 · Updated 6 years ago
- The KommonGen dataset for commonsense reasoning in Korean generative models. · ☆17 · Oct 5, 2021 · Updated 4 years ago
- ☆92 · Mar 3, 2022 · Updated 3 years ago
- Simple setup for personal dotfiles · ☆11 · Nov 29, 2025 · Updated 2 months ago
- ☆11 · Aug 12, 2020 · Updated 5 years ago
- Large-scale unannotated Korean corpus for unsupervised tasks (e.g. language modeling) · ☆28 · Aug 11, 2019 · Updated 6 years ago
- Korean text data preprocessing toolkit for NLP · ☆18 · Jun 11, 2019 · Updated 6 years ago
- ☆19 · Jan 17, 2021 · Updated 5 years ago
- The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020) · ☆119 · Oct 8, 2020 · Updated 5 years ago
- Transformer-based Korean speech recognition · ☆31 · Feb 19, 2021 · Updated 4 years ago
- BPE-based Korean T5 model for a unified text-to-text framework · ☆63 · Apr 17, 2024 · Updated last year
- Korean Speech to English Translation Corpus · ☆45 · Sep 3, 2021 · Updated 4 years ago
- A Namuwiki dataset segmented into sentences. Download it from Releases or via tfds-korean. · ☆19 · Jun 16, 2021 · Updated 4 years ago
- Improved performance through changes to the KoSentenceBERT model architecture · ☆10 · Nov 22, 2020 · Updated 5 years ago
- Open Korean NLP Dataset Curation for the Users All Around the Globe · ☆152 · Nov 18, 2023 · Updated 2 years ago
- MeCab model trained with OpenKorPos. · ☆23 · Jun 19, 2022 · Updated 3 years ago
- 🇰🇷 Text to Image in Korean · ☆85 · Jan 18, 2022 · Updated 4 years ago
- Korean-English Bilingual Electra Models · ☆110 · Nov 22, 2021 · Updated 4 years ago
- Machine Generated Captions for Best Artworks · ☆22 · Sep 21, 2022 · Updated 3 years ago
- Korean Named Entity Corpus · ☆25 · May 12, 2023 · Updated 2 years ago
- A collection of public Korean instruction datasets for training language models. · ☆19 · Jul 16, 2023 · Updated 2 years ago
- ELECTRA-based Korean conversational language model · ☆53 · Aug 4, 2021 · Updated 4 years ago
- ☆39 · Mar 25, 2024 · Updated last year