🇰🇷 Korean LLM Datasets | Pre-training, SFT, DPO, RLHF, CoT | Korean LLM dataset curation
☆41 · Jan 20, 2026 · Updated 5 months ago
Alternatives and similar repositories for LLM-Ko-Datasets
Users who are interested in LLM-Ko-Datasets are comparing it to the libraries listed below.
- Awesome-SLM: a curated list of Small Language Models ☆32 · Jun 24, 2024 · Updated 2 years ago
- ☆12 · Oct 3, 2024 · Updated last year
- ☆14 · Dec 22, 2024 · Updated last year
- Utility package for the LangChain Open Tutorial ☆10 · Feb 2, 2025 · Updated last year
- 2024 PyCon Korea tutorial ☆12 · Nov 8, 2024 · Updated last year
- Kor-IR: Korean Information Retrieval Benchmark ☆87 · Jul 3, 2024 · Updated last year
- ☆109 · Oct 13, 2025 · Updated 8 months ago
- Papers I have read and reviewed on NLP, CV, and deep learning. You can check the paper links and my reviews. ☆13 · Jan 3, 2024 · Updated 2 years ago
- Dataset Resplitting for Generalization in KGQA. See also https://github.com/semantic-systems/KGQA-datasets ☆17 · Jun 29, 2022 · Updated 4 years ago
- Making the transition from Scratch to Python ☆10 · Apr 11, 2017 · Updated 9 years ago
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning" ☆28 · Mar 2, 2026 · Updated 3 months ago
- "Unix & Linux Shell Script Exercise Dictionary" - Hanbit Media ☆10 · Jan 17, 2017 · Updated 9 years ago
- These scripts clean the unused EBS volumes, AMIs and snapshots on Amazon Web Services. ☆11 · Jul 24, 2015 · Updated 10 years ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation ☆11 · May 27, 2022 · Updated 4 years ago
- An educational framework exploring lightweight multi-agent orchestration. Managed by the OpenAI Solutions team. ☆16 · Oct 20, 2024 · Updated last year
- ☆83 · May 8, 2024 · Updated 2 years ago
- Repository for the lecture series on completing RAG with agents, by Kane of 모두의 AI. ☆19 · Dec 16, 2025 · Updated 6 months ago
- A tool for converting the Sejong syntactically parsed corpus into dependency structures ☆10 · Sep 7, 2018 · Updated 7 years ago
- A toolkit to automatically crawl the paper list and download paper PDFs from the ACL Anthology. ☆11 · Nov 12, 2025 · Updated 7 months ago
- my-claude-code-asset ☆122 · Apr 11, 2026 · Updated 2 months ago
- A collection of Python agent samples built with the Google Agent Development Kit (ADK), demonstrating integrations with services like B… ☆21 · May 8, 2026 · Updated last month
- AutoRAG example about benchmarking Korean embeddings. ☆45 · Oct 2, 2024 · Updated last year
- Created an inverted index in Python for document retrieval ☆13 · Oct 7, 2018 · Updated 7 years ago
- Shows how to deploy and use an agent with an LLM. ☆19 · Mar 1, 2025 · Updated last year
- From a Packt Publishing book ☆15 · Mar 9, 2016 · Updated 10 years ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M… ☆28 · Mar 14, 2024 · Updated 2 years ago
- Huggies is a plug and play automation tool for AWS Elastic Beanstalk ☆13 · Nov 8, 2017 · Updated 8 years ago
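- 2019 Korean Language Competition, Korean dependency parsing: Grand Prize (Minister of Culture, Sports and Tourism Award) ☆16 · Oct 26, 2022 · Updated 3 years ago
- Python library for processing Korean (Hangul) text. Python Korean Morphological Analyzer ☆19 · Feb 4, 2025 · Updated last year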
- Korean Training Data Set Generator for Google SyntaxNet ☆13 · Jun 27, 2017 · Updated 9 years ago
- Learning linear algebra through algorithm implementation, with Python ☆24 · Sep 21, 2023 · Updated 2 years ago
- Statistics and visualization of the acceptance rate and main keywords of NeurIPS 2020 accepted papers ☆16 · Oct 12, 2020 · Updated 5 years ago
- ☆68 · Dec 29, 2025 · Updated 6 months ago
- bb25 is a fast, self-contained BM25 + Bayesian calibration implementation with a minimal Python API. ☆147 · Mar 17, 2026 · Updated 3 months ago
- Official Code Repository for Knowledge-Augmented Language Model Verification (EMNLP 2023) ☆28 · Dec 22, 2023 · Updated 2 years ago
- Covers everything from NLP history to serving in a single book. ☆25 · Dec 6, 2025 · Updated 6 months ago
- Make running benchmarks simple yet maintainable again. Currently supports only Korean cross-encoders. ☆34 · Dec 2, 2025 · Updated 6 months ago
- Shows a Korean chatbot using LangChain based on Llama3 ☆39 · Mar 1, 2025 · Updated last year
- BERTScore for Korean ☆80 · Feb 22, 2024 · Updated 2 years ago