insikk/namu_wiki_db_preprocess

A Python script that converts the Namuwiki database into a large Korean-language corpus

☆29

Alternatives and similar repositories for namu_wiki_db_preprocess

Users who are interested in namu_wiki_db_preprocess are comparing it to the repositories listed below.

j-min / Easy-Namuwiki-Extractor
Easy Namuwiki Extractor
☆29 · Nov 29, 2016 · Updated 9 years ago
lovit / kmrd
Synthetic dataset for recommender system created from Naver Movie rating system
☆26 · Dec 8, 2023 · Updated 2 years ago
warnikchow / 3i4k
Intonation-aided intention identification for Korean
☆82 · Nov 21, 2022 · Updated 3 years ago
passing2961 / KMRE
Korean Movie Review Emotion (KMRE) Dataset
☆21 · Sep 7, 2020 · Updated 5 years ago
machinereading / kor-re-gold
Korean Relation Extraction Gold Standard
☆35 · May 31, 2021 · Updated 5 years ago
lovit / sejong_corpus_cleaner
Utils for cleaning Sejong corpus data
☆37 · Sep 30, 2019 · Updated 6 years ago
enlipleai / kor_pretrain_LM
https://ailabs.enliple.com/
☆105 · Feb 25, 2021 · Updated 5 years ago
hyunwoongko / kocrawl
Collection of useful Korean crawlers
☆89 · May 22, 2023 · Updated 3 years ago
songys / Question_pair
#Paired Question
☆24 · Jun 16, 2020 · Updated 6 years ago
lovit / namuwikitext
Wikitext-format dataset of Namuwiki (the most famous Korean wiki)
☆53 · Oct 25, 2020 · Updated 5 years ago
monologg / GoEmotions-Korean
Korean version of GoEmotions Dataset 😍😢😱
☆57 · Jun 12, 2023 · Updated 3 years ago
scarletcho / KoLM
Korean text normalization and language preparation package for LM in Kaldi-based ASR system
☆64 · Apr 23, 2020 · Updated 6 years ago
lovit / flask_api_tutorial
Tutorial for building an API with Flask
☆10 · Jun 22, 2020 · Updated 6 years ago
likejazz / korean-sentence-splitter
Split Korean text into sentences using heuristic algorithm.
☆216 · Dec 24, 2020 · Updated 5 years ago
Sunkyoung / Compare-tokenizer
Tokenizer comparison experiments
☆11 · Jan 3, 2022 · Updated 4 years ago
dsindex / iclassifier
Reference PyTorch code for intent classification
☆44 · Oct 18, 2024 · Updated last year
jeongukjae / korean-wikipedia-corpus
Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it through tfds-korean.
☆24 · Sep 6, 2023 · Updated 2 years ago
e9t / nsmc
Naver sentiment movie corpus
☆603 · Mar 7, 2017 · Updated 9 years ago
ModuNLP / hacking_transformers
☆11 · Aug 12, 2020 · Updated 5 years ago
insikk / bow_image_retrieval
Bag-of-words Image Retrieval
☆17 · Jan 3, 2018 · Updated 8 years ago
coolengineer / sejong-corpus
Korean sejong corpus download and simple analysis
☆151 · May 9, 2019 · Updated 7 years ago
monologg / DistilKoBERT
Distillation of KoBERT from SKTBrain (Lightweight KoBERT)
☆200 · Sep 6, 2023 · Updated 2 years ago
jeongukjae / korean-spacing-model
A Korean sentence spacing (deletion/insertion) model. Written so that you can train it yourself after preparing the data.
☆56 · Jul 11, 2022 · Updated 4 years ago
doublems / korean-bad-words
I hope this list will have a good influence on Korean online services.
☆63 · Feb 10, 2019 · Updated 7 years ago
nawnoes / pytorch-meena
Open-domain chatbot (Meena-style) with a vanilla Transformer seq2seq in PyTorch.
☆27 · Jan 12, 2026 · Updated 6 months ago
nlp-kkmas / korean-embedding-study
Study notes repository for 'Korean Embeddings' (한국어 임베딩), the natural language processing book by Lee Gichang (ratsgo) [DONE]
☆23 · Jan 15, 2020 · Updated 6 years ago
lyeoni / KorQuAD
KorQuAD (Korean Question Answering Dataset) submission guide using PyTorch pretrained BERT
☆31 · Jun 18, 2019 · Updated 7 years ago
seujung / t5-summarization
☆25 · Oct 28, 2020 · Updated 5 years ago
human-rights-corpus / HRC
Human Rights Corpus (#인권코퍼스)
☆31 · Oct 6, 2023 · Updated 2 years ago
lovit / kowikitext
☆19 · Jan 17, 2021 · Updated 5 years ago
SungjoonPark / KoreanWordVectors
Subword-level Word Vector Representations for Korean (ACL 2018)
☆107 · Oct 17, 2019 · Updated 6 years ago
hanjanghoon / NLP_Koeran_DP
Korean dependency parsing, Grand Prize (Minister of Culture, Sports and Tourism Award) at the 2019 Korean Language Competition
☆16 · Oct 26, 2022 · Updated 3 years ago
warnikchow / sae4k
Structured argument extraction for Korean
☆22 · Feb 17, 2022 · Updated 4 years ago
emorynlp / ud-korean
Universal Dependency Treebanks in Korean
☆39 · Dec 19, 2021 · Updated 4 years ago
eagle705 / korean-ner-cnn-bilstm
A CNN+BiLSTM-based Korean named entity recognizer
☆57 · Nov 26, 2019 · Updated 6 years ago
snoop2head / KLUE-RBERT
↔️ Utilizing RBERT model structure for KLUE Relation Extraction task
☆15 · Nov 15, 2022 · Updated 3 years ago
sunggukcha / xor
xor activation
☆26 · Jan 6, 2020 · Updated 6 years ago
seopbo / nlp_classification
Implementing nlp papers relevant to classification with PyTorch, gluonnlp
☆229 · Dec 8, 2022 · Updated 3 years ago
bluedisk / hangul-toolkit
Toolkit for decomposing and composing Hangul jamo
☆296 · Nov 1, 2024 · Updated last year