cynthia / kosentences
Large-scale unannotated Korean corpus for unsupervised tasks (e.g., language modeling).
☆28 · Aug 11, 2019 · Updated 6 years ago
Alternatives and similar repositories for kosentences
Users who are interested in kosentences are comparing it to the repositories listed below.
- Prosody-semantics Interface in Seoul Korean ☆12 · Oct 9, 2020 · Updated 5 years ago
- Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it via tfds-korean. ☆24 · Sep 6, 2023 · Updated 2 years ago
- Korean morphological analyzer ☆28 · Dec 22, 2019 · Updated 6 years ago
- Korean honorific correction ☆26 · Dec 31, 2022 · Updated 3 years ago
- KoParadigm: Korean Inflectional Paradigm Generator ☆57 · Nov 23, 2022 · Updated 3 years ago
- Korean text data preprocessing toolkit for NLP ☆18 · Jun 11, 2019 · Updated 6 years ago
- The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020) ☆119 · Oct 8, 2020 · Updated 5 years ago
- Performance improvement through changes to the KoSentenceBERT model architecture ☆10 · Nov 22, 2020 · Updated 5 years ago
- Open Korean NLP Dataset Curation for the Users All Around the Globe ☆152 · Nov 18, 2023 · Updated 2 years ago
- ☆33 · Aug 30, 2023 · Updated 2 years ago
- MeCab model trained with OpenKorPos. ☆23 · Jun 19, 2022 · Updated 3 years ago
- Korean ALBERT ☆46 · Nov 11, 2019 · Updated 6 years ago
- Deep NLP 2 (2019.3-5) ☆11 · Feb 19, 2019 · Updated 6 years ago
- Tool for converting the Sejong syntactically parsed corpus into dependency structures ☆10 · Sep 7, 2018 · Updated 7 years ago
- A collection of Korean text datasets ready to use with TensorFlow Datasets. ☆20 · Jun 8, 2022 · Updated 3 years ago
- Training Transformers of Huggingface with KoNLPy ☆68 · Aug 28, 2020 · Updated 5 years ago
- ☆14 · Nov 19, 2020 · Updated 5 years ago
- 야자타임 (Yaja Time, a.k.a. late-night natural language processing time) ☆27 · Mar 31, 2021 · Updated 4 years ago
- Adds noise to Korean documents. ☆27 · Nov 9, 2022 · Updated 3 years ago
- Meetup every Thursday at 20:00 ☆16 · Jul 24, 2020 · Updated 5 years ago
- Bi-LSTM-CRF named entity recognition model for Korean (Keras) ☆16 · Feb 7, 2018 · Updated 8 years ago
- Data consisting only of dialogue example sentences extracted from a dictionary ☆16 · Apr 24, 2023 · Updated 2 years ago
- Korean Speech to English Translation Corpus ☆45 · Sep 3, 2021 · Updated 4 years ago
- Korean Relation Extraction Gold Standard ☆35 · May 31, 2021 · Updated 4 years ago
- Reference PyTorch code for named entity tagging ☆87 · Oct 18, 2024 · Updated last year
- A CNN+BiLSTM-based Korean named entity recognizer ☆57 · Nov 26, 2019 · Updated 6 years ago
- ☆15 · Nov 28, 2021 · Updated 4 years ago
- ☆23 · Oct 30, 2023 · Updated 2 years ago
- Korean lexical semantic analysis model ☆21 · Apr 4, 2022 · Updated 3 years ago
- Subword-level Word Vector Representations for Korean (ACL 2018) ☆107 · Oct 17, 2019 · Updated 6 years ago
- Namuwiki dataset segmented into sentences. Download it from Releases or through tfds-korean. ☆19 · Jun 16, 2021 · Updated 4 years ago
- Data from KAIST (a Korean treebank). ☆19 · Nov 12, 2025 · Updated 3 months ago
- Simple extension of WikiExtractor (https://github.com/attardi/wikiextractor) ☆16 · Dec 23, 2016 · Updated 9 years ago
- Split Korean text into sentences using a heuristic algorithm. ☆214 · Dec 24, 2020 · Updated 5 years ago
- Dataset handler for text mining practice ☆38 · Dec 6, 2019 · Updated 6 years ago
- Korean Parallel Corpus ☆147 · Feb 24, 2024 · Updated last year
- A Korean sentence spacing (deletion/insertion) model, written so that you can train it yourself after preparing the data. ☆57 · Jul 11, 2022 · Updated 3 years ago
- KSenticNet: Korean sentiment lexicon ☆33 · May 20, 2019 · Updated 6 years ago
- A utility for storing and reading files for Korean LM training 💾 ☆35 · Oct 15, 2025 · Updated 3 months ago