🇰🇷 Korean LLM Datasets | Pre-training, SFT, DPO, RLHF, CoT | Korean LLM dataset curation
☆41 · Jan 20, 2026 · Updated 4 months ago
Alternatives and similar repositories for LLM-Ko-Datasets
Users who are interested in LLM-Ko-Datasets are comparing it to the libraries listed below.
- Awesome-SLM: a curated list of Small Language Models ☆30 · Jun 24, 2024 · Updated last year
- ☆12 · Oct 3, 2024 · Updated last year
- ☆14 · Dec 22, 2024 · Updated last year
- 2024 PyCon Korea tutorial ☆12 · Nov 8, 2024 · Updated last year
- Kor-IR: Korean Information Retrieval Benchmark ☆87 · Jul 3, 2024 · Updated last year
- ☆109 · Oct 13, 2025 · Updated 7 months ago
- Dataset Resplitting for Generalization in KGQA. See also https://github.com/semantic-systems/KGQA-datasets ☆17 · Jun 29, 2022 · Updated 3 years ago
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning" ☆28 · Mar 2, 2026 · Updated 3 months ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation ☆11 · May 27, 2022 · Updated 4 years ago
- ☆83 · May 8, 2024 · Updated 2 years ago
- Lecture repository for 모두의 AI Kane's "RAG completed with Agents" course. ☆19 · Dec 16, 2025 · Updated 5 months ago
- ☆64 · Jul 21, 2025 · Updated 10 months ago
- A curated list of an incredible number of publications related to Dialogue Systems, especially Chatbots and Chit-chat Systems ☆10 · Dec 5, 2019 · Updated 6 years ago
- Tool for converting the Sejong syntactic parsing corpus into dependency structures ☆10 · Sep 7, 2018 · Updated 7 years ago
- my-claude-code-asset ☆122 · Apr 11, 2026 · Updated last month
- A collection of Python agent samples built with the Google Agent Development Kit (ADK), demonstrating integrations with services like B… ☆21 · May 8, 2026 · Updated last month
- AutoRAG example about benchmarking Korean embeddings. ☆45 · Oct 2, 2024 · Updated last year
- Created an inverted index in Python for document retrieval ☆13 · Oct 7, 2018 · Updated 7 years ago
- KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch ☆15 · Feb 13, 2022 · Updated 4 years ago
- ☆19 · Sep 3, 2024 · Updated last year
- It shows how to deploy and use an agent with an LLM. ☆20 · Mar 1, 2025 · Updated last year
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M… ☆28 · Mar 14, 2024 · Updated 2 years ago
- 2019 Korean Language Competition, Korean dependency parsing, Grand Prize (Minister of Culture, Sports and Tourism Award) ☆15 · Oct 26, 2022 · Updated 3 years ago
- Python Hangul processing library. Python Korean Morphological Analyzer ☆19 · Feb 4, 2025 · Updated last year
- Korean Training Data Set Generator for Google SyntaxNet ☆13 · Jun 27, 2017 · Updated 8 years ago
- Statistics and Visualization of acceptance rate, main keyword of NeurIPS 2020 accepted papers ☆16 · Oct 12, 2020 · Updated 5 years ago
- bb25 is a fast, self-contained BM25 + Bayesian calibration implementation with a minimal Python API. ☆147 · Mar 17, 2026 · Updated 2 months ago
- ☆35 · Mar 22, 2026 · Updated 2 months ago
- Make running benchmarks simple yet maintainable again. Currently supports only Korean-based cross-encoders. ☆34 · Dec 2, 2025 · Updated 6 months ago
- It shows a Korean chatbot using LangChain based on Llama3 ☆39 · Mar 1, 2025 · Updated last year
- BERTScore for Korean ☆80 · Feb 22, 2024 · Updated 2 years ago
- Federated learning based app ☆12 · May 11, 2023 · Updated 3 years ago
- (Deprecated) use open-korean-text ☆12 · Jul 3, 2018 · Updated 7 years ago
- This repo is for Korean wiki table question answering datasets described in the paper of Korean-Specific Dataset for Table Question Answe… ☆91 · Oct 22, 2024 · Updated last year
- Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, m… ☆57 · Jun 3, 2025 · Updated last year
- Create and manage Amazon SageMaker HyperPod clusters, run distributed model training ☆24 · May 21, 2026 · Updated 2 weeks ago
- ☆27 · Jan 6, 2023 · Updated 3 years ago
- SageMaker-based fine-tuning and deployment hands-on example of a Korean NLP downstream task. Recommended for customers considering adopti… ☆15 · Nov 28, 2022 · Updated 3 years ago
- Korean datasets available on huggingface ☆36 · Oct 10, 2024 · Updated last year