Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper
☆51Jun 12, 2023Updated 2 years ago
Alternatives and similar repositories for transformers-data-augmentation
Users that are interested in transformers-data-augmentation are comparing it to the libraries listed below
- Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper☆135Jun 12, 2023Updated 2 years ago
- ☆25Oct 28, 2020Updated 5 years ago
- A repository to bind mecab for Python 3.5+. Not using swig nor pybind. (Not Maintained Now)☆28May 21, 2021Updated 4 years ago
- ☆65May 11, 2022Updated 3 years ago
- 🐸 KERMIT - A lightweight library to encode and interpret Universal Syntactic Embeddings☆58Jan 18, 2023Updated 3 years ago
- ☆19Apr 1, 2022Updated 3 years ago
- Combining encoder-based language models☆11Nov 11, 2021Updated 4 years ago
- Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022☆10Jan 6, 2023Updated 3 years ago
- PyTorch code for meta seq2seq learning☆43Jan 14, 2020Updated 6 years ago
- Improving performance by changing the KoSentenceBERT model architecture☆10Nov 22, 2020Updated 5 years ago
- Tokenizer comparison experiments☆11Jan 3, 2022Updated 4 years ago
- ☆12Mar 8, 2020Updated 5 years ago
- exBERT on Transformers🤗☆10Jun 14, 2021Updated 4 years ago
- KoBART chatbot☆45Jun 22, 2021Updated 4 years ago
- Code for the paper "What Makes Better Augmentation Strategies? Augment Difficult but Not too Different" (ICLR 22)☆12Aug 28, 2023Updated 2 years ago
- https://ailabs.enliple.com/☆105Feb 25, 2021Updated 5 years ago
- ☆12Nov 11, 2019Updated 6 years ago
- ☆14Nov 10, 2021Updated 4 years ago
- PathPiece tokenizer☆13Nov 10, 2024Updated last year
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c…☆359Feb 22, 2022Updated 4 years ago
- ☆14Sep 29, 2025Updated 5 months ago
- Korean Training Data Set Generator for Google SyntaxNet☆13Jun 27, 2017Updated 8 years ago
- Facilitate the learning, practicing, and designing of neural text matching models with a user-friendly and interactive interface.☆41Dec 8, 2022Updated 3 years ago
- Code for text augmentation method leveraging large-scale language models☆61Dec 20, 2021Updated 4 years ago
- GluonNLP tutorial for Pycon2019☆14Aug 16, 2019Updated 6 years ago
- KoGPT2 on Huggingface Transformers☆33May 4, 2021Updated 4 years ago
- A utility for storing and reading files for Korean LM training 💾☆35Oct 15, 2025Updated 4 months ago
- Convenient Text-to-Text Training for Transformers☆19Dec 10, 2021Updated 4 years ago
- 🦛 Python library for Korean (Hangul) text processing. Python Korean Morphological Analyzer☆19Feb 4, 2025Updated last year
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining☆18Nov 26, 2023Updated 2 years ago
- Adds WordNet so that EDA can also be used on Korean data☆104Apr 29, 2020Updated 5 years ago
- Kobart model on Huggingface transformers☆64Feb 15, 2022Updated 4 years ago
- Tutorial for pretraining Korean GPT-2 model☆67Jun 12, 2023Updated 2 years ago
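- Grand prize winner (Minister of Culture, Sports and Tourism Award) for Korean dependency parsing at the 2019 Korean Language Competition☆15Oct 26, 2022Updated 3 years ago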
- Material for UW Extension Data Science 350☆19Dec 31, 2017Updated 8 years ago
- Training Transformers of Huggingface with KoNLPy☆68Aug 28, 2020Updated 5 years ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Aug 20, 2024Updated last year
- The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020)☆119Oct 8, 2020Updated 5 years ago
- question generation model with KorQuAD dataset☆37Sep 6, 2021Updated 4 years ago