HAE-RAE / haerae-evaluation-toolkit
The most modern LLM evaluation toolkit
☆70 · Nov 9, 2025 · Updated 3 months ago
Alternatives and similar repositories for haerae-evaluation-toolkit
Users who are interested in haerae-evaluation-toolkit compare it to the libraries listed below.
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models ☆25 · Aug 24, 2024 · Updated last year
- Official repository for KoMT-Bench built by LG AI Research ☆71 · Aug 8, 2024 · Updated last year
- Performs benchmarking on two Korean datasets with minimal time and effort. ☆45 · Jan 22, 2026 · Updated 3 weeks ago
- ☆114 · Jul 30, 2025 · Updated 6 months ago
- Consolidated collection of Korean benchmark evaluation code (?) ☆20 · Nov 15, 2024 · Updated last year
- BERT score for text generation ☆12 · Jan 15, 2025 · Updated last year
- ☆20 · Jul 24, 2024 · Updated last year
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 · Mar 25, 2025 · Updated 10 months ago
- Korean datasets available on Hugging Face ☆35 · Oct 10, 2024 · Updated last year
- A GitHub repo that makes the hwplib package easy to use from Python. ☆55 · Mar 29, 2025 · Updated 10 months ago
- Difference-based Contrastive Learning for Korean Sentence Embeddings ☆23 · Updated this week
- A collection of public Korean instruction datasets for training language models. ☆452 · Apr 13, 2025 · Updated 10 months ago
- Korean LLM leaderboard and model performance/safety management ☆22 · Sep 26, 2023 · Updated 2 years ago
- Korean datasets for each of the 3 steps of ChatGPT's RLHF training ☆40 · Nov 21, 2023 · Updated 2 years ago
- This is a hands-on for ML beginners to perform SimCSE step-by-step. Implemented both supervised SimCSE and unsupervised SimCSE, and dist… ☆22 · Oct 6, 2023 · Updated 2 years ago
- An unofficial implementation of the SOLAR-10.7B model and the newly proposed interlocked-DUS (iDUS), with implementation and experiment details. ☆14 · Mar 20, 2024 · Updated last year
- Multi-domain reasoning benchmark for Korean language models ☆201 · Oct 17, 2024 · Updated last year
- Benchmark in Korean Context ☆136 · Sep 26, 2023 · Updated 2 years ago
- ☆64 · Jul 21, 2025 · Updated 6 months ago
- bpe based korean t5 model for text-to-text unified framework ☆63 · Apr 17, 2024 · Updated last year
- Kor-IR: Korean Information Retrieval Benchmark ☆87 · Jul 3, 2024 · Updated last year
- KURE: an embedding model specialized for Korean retrieval, developed at Korea University ☆206 · Sep 10, 2025 · Updated 5 months ago
- CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean ☆47 · Dec 23, 2024 · Updated last year
- Open Source + Multilingual MLLM + Fine-tuning + Distillation + More efficient models and learning + ? ☆18 · Jan 31, 2025 · Updated last year
- Developing a Korean LLM model: Hate Speech Filtering, Improving conversational skills, Finetuning with the RLHF method ☆19 · May 27, 2025 · Updated 8 months ago
- Official datasets and pytorch implementation repository of SQuARe and KoSBi (ACL 2023) ☆248 · Jun 29, 2023 · Updated 2 years ago
- The Universe of Evaluation. All about the evaluation for LLMs. ☆232 · Jul 9, 2024 · Updated last year
- A sentence-level sentiment analysis dataset of Korean financial news labeled positive or negative (finance sentiment corpus). ☆109 · Nov 3, 2023 · Updated 2 years ago
- evolve llm training instruction, from english instruction to any language. ☆119 · Sep 15, 2023 · Updated 2 years ago
- [KO-Platy🥮] KO-platypus model: llama-2-ko fine-tuned with Korean-Open-platypus ☆73 · Aug 24, 2025 · Updated 5 months ago
- ☆61 · Sep 18, 2025 · Updated 4 months ago
- Yet another python binding for mecab-ko ☆88 · May 16, 2023 · Updated 2 years ago
- ☆19 · Oct 24, 2023 · Updated 2 years ago
- Evaluating Multimodal Generative AI with Korean Educational Standards, NAACL 2025. ☆24 · May 15, 2025 · Updated 9 months ago
- Repository for KDA (Knowledge-dependent Answerability), EMNLP 2022 work ☆13 · Feb 27, 2023 · Updated 2 years ago
- Unofficial API for CLOVA X ☆37 · Dec 27, 2023 · Updated 2 years ago
- A fairy-tale creation service for children, My AI Fairy-Tale ☆11 · Apr 7, 2023 · Updated 2 years ago
- Project of llm evaluation to Japanese tasks ☆91 · Feb 4, 2026 · Updated last week
- ☆123 · Apr 21, 2023 · Updated 2 years ago