EleutherAI / dps
Data processing system for polyglot
☆90 · Updated last year
Related projects
Alternatives and complementary repositories for dps
- Official repository for KoMT-Bench built by LG AI Research ☆49 · Updated 3 months ago
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models ☆25 · Updated 2 months ago
- [Google Meet] MLLM Arxiv Casual Talk ☆55 · Updated last year
- 🤗 Sample code for training an LM with minimal setup ☆57 · Updated last year
- Korean LLM leaderboard and model performance/safety management ☆22 · Updated last year
- BPE-based Korean T5 model for text-to-text unified framework ☆63 · Updated 7 months ago
- Korean Math Word Problems ☆57 · Updated 2 years ago
- Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets ☆127 · Updated 2 years ago
- Project for training Korean language models (Flax, PyTorch with Huggingface Accelerate) ☆30 · Updated last year
- Evaluation of Korean models using an in-house Korean evaluation dataset ☆30 · Updated 5 months ago
- CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean ☆41 · Updated 2 months ago
- CareCall for Seniors: Role Specified Open-Domain Dialogue dataset generated by leveraging LLMs (NAACL 2022). ☆59 · Updated 2 years ago
- Adversarial Test Dataset for Korean Multi-turn Response Selection ☆34 · Updated 2 years ago
- Evaluating language model responses using a reward model ☆27 · Updated 8 months ago
- An example of fine-tuning kogpt with oslo. ☆23 · Updated 2 years ago
- Translation of the StrategyQA dataset ☆20 · Updated 7 months ago
- [KO-Platy🥮] KO-platypus model: llama-2-ko fine-tuned using Korean-Open-platypus ☆77 · Updated last year
- Train GEMMA on TPU/GPU! (Codebase for training Gemma-Ko Series) ☆45 · Updated 8 months ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation ☆28 · Updated last year
- Dataset of Korean Threatening Conversations ☆70 · Updated 2 years ago
- KURE: an embedding model specialized for Korean retrieval, developed at Korea University ☆36 · Updated 3 weeks ago
- Data-related codebase for the polyglot project ☆19 · Updated last year
- A collection of publicly available Korean instruction datasets for training language models. ☆19 · Updated last year