Polyglot: Large Language Models of Well-balanced Competence in Multi-languages
☆484 · Aug 22, 2023 · Updated 2 years ago
Alternatives and similar repositories for polyglot
Users who are interested in polyglot are comparing it to the libraries listed below.
- KoAlpaca: An open-source language model that understands Korean instructions ☆1,576 · Oct 25, 2024 · Updated last year
- A collection of public Korean instruction datasets for training language models. ☆452 · Apr 13, 2025 · Updated 10 months ago
- ☆123 · Apr 21, 2023 · Updated 2 years ago
- List of Korean pre-trained language models. ☆188 · Aug 31, 2023 · Updated 2 years ago
- This repo is for Korean wiki table question answering datasets described in the paper of Korean-Specific Dataset for Table Question Answe… ☆91 · Oct 22, 2024 · Updated last year
- Large-scale language modeling tutorials with PyTorch ☆292 · Nov 2, 2021 · Updated 4 years ago
- ☁️ 구름 (KULLM): A Korean-specialized LLM developed at Korea University ☆589 · May 1, 2024 · Updated last year
- ☆100 · Apr 11, 2025 · Updated 10 months ago
- Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets ☆130 · Nov 12, 2022 · Updated 3 years ago
- An example of fine-tuning kogpt with oslo. ☆23 · Aug 26, 2022 · Updated 3 years ago
- KakaoBrain KoGPT (Korean Generative Pre-trained Transformer) ☆1,014 · Jan 30, 2024 · Updated 2 years ago
- The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020) ☆119 · Oct 8, 2020 · Updated 5 years ago
- ☆106 · May 8, 2023 · Updated 2 years ago
- Korean datasets for each of the 3 steps of training ChatGPT's RLHF ☆40 · Nov 21, 2023 · Updated 2 years ago
- 🤗 Sample code for training an LM with minimal setup ☆59 · May 23, 2023 · Updated 2 years ago
- Official datasets and pytorch implementation repository of SQuARe and KoSBi (ACL 2023) ☆248 · Jun 29, 2023 · Updated 2 years ago
- Pretrained ELECTRA Model for Korean ☆630 · Feb 19, 2024 · Updated 2 years ago
- Open Korean NLP Dataset Curation for the Users All Around the Globe ☆152 · Nov 18, 2023 · Updated 2 years ago
- 📖 Korean NLU Benchmark ☆587 · Jul 6, 2022 · Updated 3 years ago
- Dataset of Korean Threatening Conversations ☆72 · Nov 1, 2022 · Updated 3 years ago
- APEACH: Attacking Pejorative Expressions with Analysis on Crowd-generated Hate Speech Evaluation Datasets ☆77 · Feb 5, 2023 · Updated 3 years ago
- Korean-English Bilingual Electra Models ☆110 · Nov 22, 2021 · Updated 4 years ago
- Pecab: Pure python Korean morpheme analyzer based on Mecab ☆172 · Apr 27, 2024 · Updated last year
- ☆197 · May 22, 2023 · Updated 2 years ago
- ☆92 · Mar 3, 2022 · Updated 3 years ago
- Yet another python binding for mecab-ko ☆88 · May 16, 2023 · Updated 2 years ago
- ☆148 · Jun 24, 2022 · Updated 3 years ago
- KSS: Korean String processing Suite ☆468 · Nov 13, 2025 · Updated 3 months ago
- Links to Korean datasets ☆905 · Oct 14, 2024 · Updated last year
- OSLO: Open Source for Large-scale Optimization ☆175 · Sep 9, 2023 · Updated 2 years ago
- 🦅 Pretrained BigBird Model for Korean (up to 4096 tokens) ☆201 · Dec 28, 2023 · Updated 2 years ago
- data related codebase for polyglot project ☆19 · Mar 30, 2023 · Updated 2 years ago
- KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding ☆310 · Jul 9, 2023 · Updated 2 years ago
- ☆36 · Oct 4, 2023 · Updated 2 years ago
- KoLLaVA: Korean Large Language-and-Vision Assistant (feat.LLaVA) ☆297 · Sep 20, 2024 · Updated last year
- BERTScore for Korean ☆80 · Feb 22, 2024 · Updated 2 years ago
- OSLO: Open Source framework for Large-scale model Optimization ☆309 · Aug 25, 2022 · Updated 3 years ago
- A BERT-based reverse dictionary of Korean proverbs ☆97 · Feb 28, 2023 · Updated 3 years ago
- ☆19 · Sep 20, 2022 · Updated 3 years ago