Beomi / transformers-language-modeling
Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3
☆23Updated 3 years ago
Alternatives and similar repositories for transformers-language-modeling:
Users who are interested in transformers-language-modeling compare it to the repositories listed below
- Google's official ROUGE implementation, adapted for use with Korean☆14Updated last year
- Beyond LM: How can language models go forward in the future?☆15Updated last year
- A collection of public Korean instruction datasets for training language models.☆19Updated last year
- Abstractive summarization using Bert2Bert framework.☆31Updated 4 years ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation☆12Updated 2 years ago
- Hate speech detection corpus in Korean, shared with EMNLP 2023 paper☆13Updated 9 months ago
- Megatron LM 11B on Huggingface Transformers☆27Updated 3 years ago
- KLUE Benchmark 1st place (2021.12) solutions. (RE, MRC, NLI, STS, TC)☆25Updated 2 years ago
- Difference-based Contrastive Learning for Korean Sentence Embeddings☆24Updated last year
- Korean Named Entity Corpus☆25Updated last year
- Korean Abstract Meaning Representation (AMR) Corpus☆10Updated 2 years ago
- ☆26Updated 4 years ago
- Official code and dataset repository of KoBBQ (TACL 2024)☆14Updated 8 months ago
- ☆10Updated 2 months ago
- Korean dependency parsing, Grand Prize (Minister of Culture, Sports and Tourism Award) at the 2019 Korean Language Competition☆16Updated 2 years ago
- Don't Judge a Language Model by Its Last Layer: Contrastive Learning with Layer-Wise Attention Pooling☆9Updated 2 years ago
- ☆19Updated 2 years ago
- Korean Nested Named Entity Corpus☆18Updated last year
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining☆13Updated last year
- Korean Commonsense Knowledge Graph☆14Updated 2 years ago
- Machine Generated Captions for Best Artworks☆22Updated 2 years ago
- Meets every Thursday at 20:00☆16Updated 4 years ago
- Script to pre-train Hugging Face Transformers BART with TensorFlow 2☆33Updated last year
- Pre-training BART in Flax on The Pile dataset☆20Updated 3 years ago
- Calculating Expected Time for training LLM.☆38Updated last year
- ☆11Updated 4 years ago