jonghwanhyeon / namu-wiki-extractor
A library to extract plain text from the JSON dump file of Namuwiki
☆26Updated 3 years ago
Alternatives and similar repositories for namu-wiki-extractor
Users who are interested in namu-wiki-extractor are comparing it to the libraries listed below
- Parallel dataset of Korean Questions and Commands☆60Updated 2 years ago
- KoBART chatbot☆46Updated 4 years ago
- Kobart model on Huggingface transformers☆64Updated 3 years ago
- KoGPT2 on Huggingface Transformers☆33Updated 4 years ago
- APEACH: Attacking Pejorative Expressions with Analysis on Crowd-generated Hate Speech Evaluation Datasets☆77Updated 2 years ago
- Training Transformers of Huggingface with KoNLPy☆68Updated 5 years ago
- This repository contains Korean Hate Speech dataset for paper, "K-MHaS: A Multi-label Hate Speech Detection Dataset in Korean Online News…☆50Updated last year
- Korean honorific correction☆26Updated 3 years ago
- Korean T5 model☆54Updated 4 years ago
- #Paired Question☆24Updated 5 years ago
- The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020)☆119Updated 5 years ago
- Dataset of Korean Threatening Conversations☆72Updated 3 years ago
- Introduction to recent natural language processing models☆74Updated 3 years ago
- 🤗 Sample code for training an LM with minimal setup☆59Updated 2 years ago
- Korean Math Word Problems☆59Updated 4 years ago
- ☆74Updated 3 years ago
- Bias, Hate classification with KoELECTRA 👿☆27Updated 2 years ago
- Performing downstream tasks with Hugging Face☆63Updated 4 years ago
- NamuwikiExtractor for obtaining cleaned text from the Namuwiki dump☆19Updated 3 years ago
- A project for training Korean language models (Flax, PyTorch with Hugging Face Accelerate)☆32Updated 2 years ago
- Korean BERT model using character tokenizer☆27Updated 4 years ago
- Korean Online That-gul Emotions Dataset☆129Updated 2 years ago
- BERTScore for Korean☆81Updated last year
- ☆21Updated 3 years ago
- Data Augmentation Toolkit for Korean text.☆52Updated 4 years ago
- ☆19Updated 3 years ago
- Character-level (syllable-level) Korean ELECTRA Model☆54Updated 2 years ago
- KorPatBERT, a Korean AI language model specialized for the patent domain☆67Updated 2 years ago
- Wikitext format dataset of Namuwiki (the most famous Korean wiki)☆52Updated 5 years ago
- KcBERT/KcELECTRA Fine Tune Benchmarks code (forked from https://github.com/monologg/KoELECTRA/tree/master/finetune)☆47Updated 3 years ago