sun1638650145 / deep-rl-class-zh
Hugging Face Deep Reinforcement Learning Course (Chinese edition)
☆21 Updated 2 years ago
Alternatives and similar repositories for deep-rl-class-zh
Users who are interested in deep-rl-class-zh are comparing it to the repositories listed below
- An easier PyTorch deep reinforcement learning library. ☆240 Updated 10 months ago
- ☆164 Updated last year
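- SuperCLUE Langya Leaderboard (琅琊榜): an anonymous head-to-head evaluation benchmark for Chinese general-purpose large models ☆145 Updated last year
- The open source implementation of DeepSeek-R1. ☆272 Updated 7 months ago
- Notes on learning reinforcement learning through animations ☆59 Updated 8 months ago
- Line-by-line explanation of Qwen 14B and 7B ☆63 Updated 2 years ago
- This project aims to give beginners in the large-model field a comprehensive body of knowledge, covering both basic and advanced material, so that developers can quickly master the large-model technology stack and gain a full understanding of the field. ☆61 Updated 9 months ago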
- Awesome Colab Projects Collection ☆29 Updated last year
- qwen models finetuning ☆105 Updated 7 months ago
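- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with related write-ups. ☆95 Updated last year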
- The Roadmap for LLMs ☆86 Updated 2 years ago
- A tutorial demonstrating Chinese instruction fine-tuning of Gemma ☆46 Updated last year
- ☆194 Updated 8 months ago
- A survey of large language model training and serving ☆36 Updated 2 years ago
- deep learning ☆148 Updated 5 months ago
- ☆74 Updated last year
- DSPy documentation in Chinese ☆42 Updated last year
- Unlocking the many uses of the HuggingFace ecosystem ☆94 Updated 10 months ago
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆139 Updated 2 years ago
- A repo for updating and debugging Mixtral-7x8B, MOE, ChatGLM3, LLaMa2, BaChuan, Qwen and other LLM models, including new models mixtral, mixtral 8x7b, … ☆47 Updated 3 weeks ago
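- A practical guide to large language models: application practice and real-world deployment ☆80 Updated last year
- Chinese dataset distilled from the full-strength DeepSeek-R1 ☆62 Updated 8 months ago
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2 ☆55 Updated 2 years ago
- Line-by-line annotated version of the Baichuan2 code, suitable for beginners ☆214 Updated 2 years ago
- The first Chinese llama2 13b model (Base + Chinese dialogue SFT, enabling fluent multi-turn natural-language human-machine interaction) ☆91 Updated 2 years ago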
- LoRA ☆18 Updated 2 years ago
- ☆270 Updated this week
- Alpaca Chinese Dataset -- a Chinese instruction fine-tuning dataset ☆217 Updated last year
- Gemma-SFT: gemma-2b/gemma-7b fine-tuning (transformers) / LoRA (peft) / inference ☆33 Updated last year
- PyTorch distributed training ☆71 Updated 2 years ago