YongWookHa/kor-text-preprocess

YongWookHa / kor-text-preprocess

Korean text data preprocess toolkit for NLP

☆18

Alternatives and similar repositories for kor-text-preprocess

Users that are interested in kor-text-preprocess are comparing it to the libraries listed below.

songys / 2021Langcon
☆11 · Oct 3, 2021 · Updated 4 years ago
nlpai-lab / Korean-CommonGen
[Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
☆11 · May 27, 2022 · Updated 4 years ago
JoungheeKim / kor-spacing
A project for Korean automatic spacing
☆12 · Aug 3, 2020 · Updated 5 years ago
korean-named-entity / konec
Korean Named Entity Corpus
☆25 · May 12, 2023 · Updated 3 years ago
korean-named-entity / konne
Korean Nested Named Entity Corpus
☆20 · May 13, 2023 · Updated 3 years ago
smothly / bad-word-detection
Profanity detection model
☆16 · Dec 19, 2019 · Updated 6 years ago
ko-nlp / moducorpus-sanitizer
Provides functionality to convert Modu Corpus (모두의 말뭉치) data into a form convenient for analysis.
☆11 · Mar 2, 2022 · Updated 4 years ago
nlpai-lab / KommonGen
The KommonGen dataset for commonsense reasoning in Korean generative models.
☆21 · Oct 5, 2021 · Updated 4 years ago
detail-novelist / novelist-triton-server
Deploy KoGPT with Triton Inference Server
☆14 · Nov 18, 2022 · Updated 3 years ago
MrBananaHuman / PangyoCorpora
☆38 · Oct 4, 2023 · Updated 2 years ago
openkorpos / model-mecab
MeCab model trained with OpenKorPos.
☆23 · Jun 19, 2022 · Updated 4 years ago
baikalai / baikal-bert
baikal.ai's pre-trained BERT models: descriptions and sample code
☆12 · Jun 24, 2021 · Updated 5 years ago
jooinjang / Ko-ATOMIC
Korean Commonsense Knowledge Graph
☆15 · Dec 23, 2022 · Updated 3 years ago
upskyy / Automatic-Speech-Recognition-Models
End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
☆10 · Jan 21, 2022 · Updated 4 years ago
QuoQA-NLP / Ko-conceptual-captions
Google's Conceptual Captions Dataset translated into Korean
☆23 · Aug 28, 2022 · Updated 3 years ago
dobby-seo / kosr
Transformer-based Korean speech recognition
☆31 · Feb 19, 2021 · Updated 5 years ago
jason9693 / FROZEN
☆14 · May 3, 2022 · Updated 4 years ago
passing2961 / EmoNSMC
Korean large emotion labeled dataset (EmoNSMC)
☆14 · Mar 5, 2020 · Updated 6 years ago
jeongukjae / namuwiki-corpus
Namuwiki dataset segmented into sentences. Download it from Releases or via tfds-korean.
☆19 · Jun 16, 2021 · Updated 5 years ago
BitnaKeum / Web_Crawler
Crawler for Namuwiki, Wikipedia, Daum Blog, Tistory, YouTube, and Nate Pann
☆13 · Feb 20, 2026 · Updated 5 months ago
jeongukjae / korean-wikipedia-corpus
Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it through tfds-korean.
☆24 · Sep 6, 2023 · Updated 2 years ago
snunlp / KR-ELECTRA
KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch
☆15 · Feb 13, 2022 · Updated 4 years ago
Data-Intelligence-Lab / DEFT-korean-alpaca
☆23 · Oct 30, 2023 · Updated 2 years ago
songys / AwesomeKorean_Speech
Speech recognition and signal processing
☆14 · Sep 12, 2021 · Updated 4 years ago
upskyy / kf-deberta-multitask
Korean embedding model specialized for the financial domain
☆23 · Aug 8, 2024 · Updated last year
sooftware / nlp-tasks
Natural Language Processing Tasks and Examples.
☆62 · Aug 17, 2022 · Updated 3 years ago
noowad93 / chosung-translator
Chosung (initial consonant) interpreter based on ko-BART
☆29 · Mar 31, 2021 · Updated 5 years ago
korean-named-entity / konne-prep
☆19 · Jan 29, 2023 · Updated 3 years ago
lovit / flask_api_tutorial
Tutorial for building an API with Flask
☆10 · Jun 22, 2020 · Updated 6 years ago
Beomi / KcBERT-Finetune
KcBERT/KcELECTRA Fine Tune Benchmarks code (forked from https://github.com/monologg/KoELECTRA/tree/master/finetune)
☆48 · Apr 10, 2022 · Updated 4 years ago
lovit / huggingface_konlpy
Training Transformers of Huggingface with KoNLPy
☆68 · Aug 28, 2020 · Updated 5 years ago
lovit / petitions_archive
Archive of Blue House (Cheong Wa Dae) national petition data
☆16 · Aug 29, 2020 · Updated 5 years ago
monologg / GoEmotions-Korean
Korean version of GoEmotions Dataset 😍😢😱
☆57 · Jun 12, 2023 · Updated 3 years ago
lovit / namuwikitext
Wikitext-format dataset of Namuwiki (the most famous Korean wiki)
☆53 · Oct 25, 2020 · Updated 5 years ago
passing2961 / KMRE
Korean Movie Review Emotion (KMRE) Dataset
☆21 · Sep 7, 2020 · Updated 5 years ago
kakaoenterprise / KorAdvMRSTestData
Adversarial Test Dataset for Korean Multi-turn Response Selection
☆34 · Dec 16, 2021 · Updated 4 years ago
J-Seo / KommonGen
The KommonGen dataset for commonsense reasoning in Korean generative models.
☆17 · Oct 5, 2021 · Updated 4 years ago
SKplanet / Dialog-KoELECTRA
ELECTRA-based Korean conversational language model
☆54 · Aug 4, 2021 · Updated 4 years ago
SKTBrain / KVQA
Korean Visual Question Answering
☆59 · Feb 18, 2020 · Updated 6 years ago