sun1638650145 / deep-rl-class-zh
Hugging Face Deep Reinforcement Learning Course (Chinese edition)
☆20Updated 2 years ago
Alternatives and similar repositories for deep-rl-class-zh
Users that are interested in deep-rl-class-zh are comparing it to the libraries listed below
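- The first Chinese-language Llama 2 13B model (Base + Chinese dialogue SFT, enabling fluent multi-turn natural-language interaction between humans and the model)☆90Updated last year
- A survey of large language model training and serving☆37Updated last year
- Line-by-line explanation of Qwen 14B and 7B☆60Updated last year
- Companion code for the book "Deconstructing Large Language Models: From Linear Regression to Artificial General Intelligence" (《解构大语言模型:从线性回归到通用人工智能》)☆218Updated 6 months ago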
- Awesome Colab Projects Collection☆27Updated last year
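- This project aims to give beginners in the large-model field a comprehensive body of knowledge, covering both basic and advanced content, so that developers can quickly master the large-model technology stack and gain a thorough understanding of related topics.☆61Updated 6 months ago
- Alpaca Chinese Dataset -- a Chinese instruction fine-tuning dataset☆209Updated 9 months ago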
- deep learning☆148Updated 2 months ago
- bilibili video course src code☆363Updated last year
- qwen models finetuning☆100Updated 4 months ago
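- SuperCLUE Langya Leaderboard (琅琊榜): an anonymous head-to-head evaluation benchmark for Chinese general-purpose large models☆145Updated last year
- An open-source reproduction of DeepSeek-R1.☆263Updated 4 months ago
- A Practical Guide to Large Language Models: Application Practice and Real-World Deployment☆74Updated 10 months ago
- A tutorial demonstrating Chinese instruction fine-tuning of Gemma☆46Updated last year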
- 骆驼大乱斗 (Camel Brawl): Massive Game Content Generated by LLM☆19Updated last year
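- A large language model training and testing tool built on HuggingFace. Supports web UI and terminal inference for each model, parameter-efficient and full-parameter training (pre-training, SFT, RM, PPO, DPO), plus model merging and quantization.☆217Updated last year
- ChatGLM2-6B fine-tuning, SFT/LoRA, instruction finetune☆108Updated last year
- Train a mini Chinese large language model from scratch that can hold basic conversations, with the model size determined by the hardware at hand☆60Updated 11 months ago
- Companion materials for a Bilibili video course☆39Updated 2 years ago
- Unlocking the many uses of the HuggingFace ecosystem☆93Updated 7 months ago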
- GRAIN: Gradient-based Intra-attention Pruning on Pre-trained Language Models☆19Updated 2 years ago
- Simple decoder-only GPT model in PyTorch☆41Updated last year
- DPO training for Tongyi Qianwen (Qwen)☆50Updated 9 months ago
- The Roadmap for LLMs☆85Updated last year
- An easier PyTorch deep reinforcement learning library.☆228Updated 6 months ago
- ☆151Updated last year
- AM (Advanced Mathematics) Chat is a large language model that integrates advanced mathematical knowledge, exercises in higher mathematics…☆195Updated 11 months ago
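- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles.☆91Updated 10 months ago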
- llama inference for tencentpretrain☆98Updated 2 years ago
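- chatglm-6b fine-tuning/LoRA/PPO/inference; samples are automatically generated integer and decimal addition, subtraction, multiplication, and division problems; runs on GPU or CPU☆164Updated last year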