YunwenTechnology / Chinese-Data-Distill-From-R1
Chinese dataset distilled from the full-scale DeepSeek-R1 model
☆56 Updated 2 months ago
Alternatives and similar repositories for Chinese-Data-Distill-From-R1
Users who are interested in Chinese-Data-Distill-From-R1 are comparing it to the repositories listed below
- ☆226 Updated last year
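- Chinese-native retrieval-augmented generation (RAG) evaluation benchmark ☆116 Updated last year
- Alpaca Chinese Dataset -- Chinese instruction fine-tuning dataset ☆203 Updated 7 months ago
- How to train an LLM tokenizer ☆144 Updated last year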
- ☆168 Updated last year
- ☆143 Updated 10 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models" ☆80 Updated 8 months ago
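- Text deduplication ☆71 Updated 11 months ago
- Instruction-tuning tool for large language models (supports FlashAttention) ☆172 Updated last year
- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles. ☆86 Updated 8 months ago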
- This repository provides an implementation of the paper "A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Co…" ☆68 Updated 2 months ago
- As the name suggests: a hand-built RAG ☆122 Updated last year
- ChatGLM2-6B fine-tuning, SFT/LoRA, instruction finetune ☆107 Updated last year
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train… ☆56 Updated last year
- ☆63 Updated 2 years ago
- A survey of large language model training and serving ☆37 Updated last year
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"