Sunkyoung/Compare-tokenizer

Tokenizer comparison experiments

☆11

Alternatives and similar repositories for Compare-tokenizer

Users who are interested in Compare-tokenizer are comparing it to the repositories listed below.

kimsehwan96 / pyjosa
A simple Python 🇰🇷 library for handling Korean particles (josa) such as 은/는, 와/과, and 이/가. An open-source project published on PyPI.
☆25 · Jul 6, 2021 · Updated 5 years ago
naver-ai / carecall-corpus
CareCall for Seniors: Role Specified Open-Domain Dialogue dataset generated by leveraging LLMs (NAACL 2022).
☆62 · May 3, 2022 · Updated 4 years ago
songys / AwesomeKorean_Speech
Speech recognition and signal processing
☆14 · Sep 12, 2021 · Updated 4 years ago
Beomi / exbert-transformers
exBERT on Transformers🤗
☆10 · Jun 14, 2021 · Updated 5 years ago
monologg / dotfiles
Simple setup for personal dotfiles
☆11 · Jul 4, 2026 · Updated 2 weeks ago
jooinjang / Ko-ATOMIC
Korean Commonsense Knowledge Graph
☆15 · Dec 23, 2022 · Updated 3 years ago
noowad93 / chosung-translator
Chosung (initial consonant) interpreter based on ko-BART
☆29 · Mar 31, 2021 · Updated 5 years ago
jucho2725 / ktextaug
Data Augmentation Toolkit for Korean text.
☆52 · Nov 16, 2021 · Updated 4 years ago
SKTBrain / KVQA
Korean Visual Question Answering
☆59 · Feb 18, 2020 · Updated 6 years ago
eunki7 / python_create_app_1
Python Class Source Files
☆13 · Dec 27, 2019 · Updated 6 years ago
Beomi / KcBERT-Finetune
KcBERT/KcELECTRA Fine Tune Benchmarks code (forked from https://github.com/monologg/KoELECTRA/tree/master/finetune)
☆48 · Apr 10, 2022 · Updated 4 years ago
monologg / korean-hate-speech-koelectra
Bias, Hate classification with KoELECTRA 👿
☆27 · Jun 12, 2023 · Updated 3 years ago
baikalai / baikal-bert
baikal.ai's pre-trained BERT models: descriptions and sample codes
☆12 · Jun 24, 2021 · Updated 5 years ago
jeongukjae / namuwiki-corpus
Namuwiki dataset segmented into sentences. Download it from Releases or via tfds-korean.
☆19 · Jun 16, 2021 · Updated 5 years ago
deepspeedai / deepspeed-gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
☆21 · Nov 28, 2022 · Updated 3 years ago
kakaobrain / kortok
The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020)
☆119 · Oct 8, 2020 · Updated 5 years ago
AIRC-KETI / ke-t5-downstreams
☆39 · Mar 25, 2024 · Updated 2 years ago
monologg / ko_lm_dataformat
A utility for storing and reading files for Korean LM training 💾
☆35 · Updated this week
hyunwoongko / megatron-11b
Megatron LM 11B on Huggingface Transformers
☆28 · Jul 11, 2021 · Updated 5 years ago
jason9693 / SoongsilBERT-base-beep-deploy
☆21 · Apr 16, 2022 · Updated 4 years ago
lugiavn / generalization-dml
Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?
☆11 · Jan 3, 2019 · Updated 7 years ago
lovit / KoBERTScore
BERTScore for Korean
☆81 · Feb 22, 2024 · Updated 2 years ago
passing2961 / EmoNSMC
Korean large emotion labeled dataset (EmoNSMC)
☆14 · Mar 5, 2020 · Updated 6 years ago
hyunwoongko / kobart-transformers
Kobart model on Huggingface transformers
☆64 · Feb 15, 2022 · Updated 4 years ago
jeongukjae / korean-spacing-model
A Korean sentence spacing (deletion/insertion) model. Written so that you can train it yourself after preparing the data.
☆56 · Jul 11, 2022 · Updated 4 years ago
hyunwoongko / bert2bert-summarization
Abstractive summarization using Bert2Bert framework.
☆31 · Dec 5, 2020 · Updated 5 years ago
smothly / bad-word-detection
Profanity detection model
☆16 · Dec 19, 2019 · Updated 6 years ago
bab2min / corpus
A collection of personally gathered corpora for Korean NLP
☆140 · Sep 15, 2020 · Updated 5 years ago
SKplanet / Dialog-KoELECTRA
ELECTRA-based Korean conversational language model
☆54 · Aug 4, 2021 · Updated 4 years ago
toriving / KoEDA
Korean Easy Data Augmentation
☆91 · Sep 30, 2021 · Updated 4 years ago
jason9693 / Soongsil-BERT
Language model for the Soongsil University community
☆43 · Nov 6, 2021 · Updated 4 years ago
hamanlp / hama-py
🦛 Python library for Korean text processing. Python Korean Morphological Analyzer
☆19 · Feb 4, 2025 · Updated last year
korean-named-entity / konne-prep
☆19 · Jan 29, 2023 · Updated 3 years ago
korean-named-entity / konne
Korean Nested Named Entity Corpus
☆20 · May 13, 2023 · Updated 3 years ago
gtolias / mkd
MATLAB implementation of the multiple-kernel local-patch descriptor (BMVC 2017 paper)
☆14 · Jan 31, 2018 · Updated 8 years ago
LG-AI-EXAONE / KMMLU-Pro
☆16 · Aug 18, 2025 · Updated 11 months ago
smilegate-ai / HuLiC
☆93 · Mar 3, 2022 · Updated 4 years ago
paul-hyun / tf_transformers
TensorFlow 2.0 Transformer, GPT, BERT, and more
☆11 · Apr 21, 2023 · Updated 3 years ago
BM-K / Styling-Chatbot-with-Transformer
A model that varies chatbot responses according to language style and emotion
☆33 · Aug 17, 2021 · Updated 4 years ago