A utility for storing and reading files for Korean LM training 💾
☆35 · Oct 15, 2025 · Updated 4 months ago
Alternatives and similar repositories for ko_lm_dataformat
Users who are interested in ko_lm_dataformat are comparing it to the libraries listed below.
- Korean Named Entity Corpus · ☆25 · May 12, 2023 · Updated 2 years ago
- Python Hangul processing library. Python Korean Morphological Analyzer · ☆19 · Feb 4, 2025 · Updated last year
- Provides functionality to convert Modu Corpus (모두의 말뭉치) data into a form convenient for analysis. · ☆11 · Mar 2, 2022 · Updated 4 years ago
- ☆11 · Oct 3, 2021 · Updated 4 years ago
- Korean large emotion labeled dataset (EmoNSMC) · ☆14 · Mar 5, 2020 · Updated 5 years ago
- Performing downstream tasks using huggingface · ☆62 · Dec 28, 2021 · Updated 4 years ago
- This is a project for Korean auto spacing · ☆12 · Aug 3, 2020 · Updated 5 years ago
- Yet another Python binding for mecab-ko · ☆88 · May 16, 2023 · Updated 2 years ago
- MeCab model trained with OpenKorPos. · ☆23 · Jun 19, 2022 · Updated 3 years ago
- ELECTRA-based Korean conversational language model · ☆53 · Aug 4, 2021 · Updated 4 years ago
- Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets · ☆130 · Nov 12, 2022 · Updated 3 years ago
- Convenient Text-to-Text Training for Transformers · ☆19 · Dec 10, 2021 · Updated 4 years ago
- An example of fine-tuning kogpt with oslo. · ☆23 · Aug 26, 2022 · Updated 3 years ago
- baikal.ai's pre-trained BERT models: descriptions and sample codes · ☆12 · Jun 24, 2021 · Updated 4 years ago
- CareCall for Seniors: Role Specified Open-Domain Dialogue dataset generated by leveraging LLMs (NAACL 2022). · ☆60 · May 3, 2022 · Updated 3 years ago
- Finetuning Pipeline · ☆89 · Feb 25, 2022 · Updated 4 years ago
- KoGPT2 on Huggingface Transformers · ☆33 · May 4, 2021 · Updated 4 years ago
- Pretrained BigBird Model for Korean (up to 4096 tokens) · ☆201 · Dec 28, 2023 · Updated 2 years ago
- KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch · ☆15 · Feb 13, 2022 · Updated 4 years ago
- The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020) · ☆119 · Oct 8, 2020 · Updated 5 years ago
- Machine Generated Captions for Best Artworks · ☆22 · Sep 21, 2022 · Updated 3 years ago
- Character-level Korean ELECTRA Model (syllable-level Korean ELECTRA) · ☆54 · Jun 12, 2023 · Updated 2 years ago
- Meeting every Thursday at 20:00 · ☆16 · Jul 24, 2020 · Updated 5 years ago
- [HCLT 2022] Korean sentence text similarity dataset using Naver shopping reviews · ☆25 · Oct 20, 2022 · Updated 3 years ago
- Korean-English Bilingual Electra Models · ☆110 · Nov 22, 2021 · Updated 4 years ago
- HuggingFace Transformers tutorial using KLUE data · ☆129 · Jun 28, 2021 · Updated 4 years ago
- T5-base model for Korean · ☆27 · May 20, 2021 · Updated 4 years ago
- KOLD: Korean Offensive Language Dataset · ☆81 · Nov 13, 2022 · Updated 3 years ago
- Standalone Nori (Korean Morphological Analyzer) · ☆42 · Sep 20, 2023 · Updated 2 years ago
- Adversarial Test Dataset for Korean Multi-turn Response Selection · ☆34 · Dec 16, 2021 · Updated 4 years ago
- Korean Online That-gul Emotions Dataset · ☆130 · Jun 24, 2023 · Updated 2 years ago
- Korean Nested Named Entity Corpus · ☆20 · May 13, 2023 · Updated 2 years ago
- ☆14 · Dec 9, 2021 · Updated 4 years ago
- BPE-based Korean T5 model for text-to-text unified framework · ☆63 · Apr 17, 2024 · Updated last year
- ☆19 · Jan 29, 2023 · Updated 3 years ago
- [Unofficial] Kakaotrans: Kakao translate API for Python · ☆16 · Mar 29, 2020 · Updated 5 years ago
- Adds noise to Korean text. · ☆27 · Nov 9, 2022 · Updated 3 years ago
- Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it via tfds-korean. · ☆24 · Sep 6, 2023 · Updated 2 years ago
- This repo is for Korean wiki table question answering datasets described in the paper of Korean-Specific Dataset for Table Question Answe… · ☆91 · Oct 22, 2024 · Updated last year