EleutherAI / polyglot
Polyglot: Large Language Models of Well-balanced Competence in Multi-languages
☆482Updated last year
Alternatives and similar repositories for polyglot
Users who are interested in polyglot are comparing it to the repositories listed below
- Large-scale language modeling tutorials with PyTorch☆290Updated 3 years ago
- Korean Multi-task Instruction Tuning☆158Updated last year
- Data processing system for polyglot☆91Updated last year
- ☆196Updated last year
- Korean Sentence Embedding Repository☆209Updated 5 months ago
- Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets☆129Updated 2 years ago
- ☆105Updated 2 years ago
- KoCLIP: Korean port of OpenAI CLIP, in Flax☆151Updated last year
- List of Korean pre-trained language models.☆185Updated last year
- Train GEMMA on TPU/GPU! (Codebase for training Gemma-Ko Series)☆47Updated last year
- 🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)☆202Updated last year
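- [2021 Hunminjeongeum Korean Speech and Natural Language AI Competition] Repository for sharing the dialogue summarization training and inference code of team 알라꿍달라꿍 (dialogue summarization track).☆128Updated 2 years ago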
- Forked repo from https://github.com/EleutherAI/lm-evaluation-harness/commit/1f66adc☆76Updated last year
- Benchmark in Korean Context☆131Updated last year
- OSLO: Open Source for Large-scale Optimization☆175Updated last year
- Open Korean NLP Dataset Curation for the Users All Around the Globe☆147Updated last year
- [KO-Platy🥮] KO-platypus model: llama-2-ko fine-tuned on Korean-Open-platypus☆75Updated last year
- KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding☆305Updated last year
- A collection of open Korean instruction datasets for training language models.☆415Updated last month
- Official datasets and pytorch implementation repository of SQuARe and KoSBi (ACL 2023)☆241Updated last year
- Pecab: Pure python Korean morpheme analyzer based on Mecab☆165Updated last year
- Korean LLM fine-tuned from KoAlpaca using the IA3 method☆68Updated last year
- The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020)☆118Updated 4 years ago
- OSLO: Open Source framework for Large-scale model Optimization☆308Updated 2 years ago
- Curation note of NLP datasets☆96Updated 2 years ago
- Pretrained Language Models for Korean☆391Updated 2 years ago
- Evolve LLM training instructions, from English instructions to any language.☆117Updated last year
- ☆124Updated 2 years ago
- 🥤🧑🏻🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization…☆230Updated last year
- The Universe of Evaluation. All about the evaluation for LLMs.☆225Updated 10 months ago