Korean text normalization and language preparation package for LM in Kaldi-based ASR system
☆63Apr 23, 2020Updated 5 years ago
Alternatives and similar repositories for KoLM
Users that are interested in KoLM are comparing it to the libraries listed below
- Korean grapheme-to-phone conversion in Python☆133Jan 27, 2020Updated 6 years ago
- Korean Speech to English Translation Corpus☆45Sep 3, 2021Updated 4 years ago
- ☆11Oct 3, 2021Updated 4 years ago
- Provides functionality for converting Modu Corpus (모두의 말뭉치) data into a form convenient for analysis.☆11Mar 2, 2022Updated 4 years ago
- MeCab model trained with OpenKorPos.☆23Jun 19, 2022Updated 3 years ago
- Korean speech recognition based on transformer☆31Feb 19, 2021Updated 5 years ago
- 📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.☆22Jul 12, 2019Updated 6 years ago
- Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language☆43Feb 28, 2018Updated 8 years ago
- Convert Numerical Representations to Korean Pronunciation☆14Apr 20, 2020Updated 5 years ago
- (semi) Grapheme-to-Phoneme (G2P) - seq2seq model using PyTorch for Korean☆23Dec 17, 2017Updated 8 years ago
- Megatron LM 11B on Huggingface Transformers☆27Jul 11, 2021Updated 4 years ago
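- Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it via tfds-korean.☆24Sep 6, 2023Updated 2 years ago
- Open-source conversational speech synthesis platform that can express a robot's emotions and personality☆108Feb 5, 2025Updated last year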
- g2pK: g2p module for Korean☆266Mar 1, 2022Updated 4 years ago
- Training Transformers of Huggingface with KoNLPy☆68Aug 28, 2020Updated 5 years ago
- This is a project for Korean auto spacing☆12Aug 3, 2020Updated 5 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆45May 25, 2021Updated 4 years ago
- Paper Review about Speech Recognition · NLP☆10Mar 25, 2021Updated 4 years ago
- ☆11Aug 12, 2020Updated 5 years ago
- Korean text data preprocess toolkit for NLP☆18Jun 11, 2019Updated 6 years ago
- A python script to convert namu wiki database to huge Korean language corpus☆30Apr 21, 2017Updated 8 years ago
- Repository for speech paper reading☆33Aug 19, 2021Updated 4 years ago
- Implementation of Korean FastSpeech2☆215Jan 29, 2023Updated 3 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆16Dec 3, 2024Updated last year
- Improved performance through changes to the KoSentenceBERT model architecture☆10Nov 22, 2020Updated 5 years ago
- Tutorial for building an API with Flask☆10Jun 22, 2020Updated 5 years ago
- ☆14Aug 16, 2023Updated 2 years ago
- Chosung (initial consonant) interpreter based on ko-BART☆29Mar 31, 2021Updated 4 years ago
- A Korean named entity recognizer based on CNN+BiLSTM☆57Nov 26, 2019Updated 6 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- Korean Movie Review Emotion (KMRE) Dataset☆21Sep 7, 2020Updated 5 years ago
- Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni…☆23Aug 16, 2021Updated 4 years ago
- Review of papers I read☆14Dec 11, 2020Updated 5 years ago
- Dialogue generation models (GPT-2 and Meena) of Pingpong, ScatterLab.☆21Nov 15, 2021Updated 4 years ago
- Wav2Vec2 finetune and inference code for IITP AI Grand Challenge☆36Feb 22, 2022Updated 4 years ago
- Wikitext format dataset of Namuwiki (Most famous Korean wikipedia)☆53Oct 25, 2020Updated 5 years ago
- baikal.ai's pre-trained BERT models: descriptions and sample codes☆12Jun 24, 2021Updated 4 years ago
- A collection of public Korean instruction datasets for training language models.☆19Jul 16, 2023Updated 2 years ago
- 🤗 Sample code for training an LM with minimal setup☆59May 23, 2023Updated 2 years ago