erxiong0 / ReinforcementLearning-R.S.
To reproduce the experiments in Sutton's book
☆14 Updated 4 months ago
Alternatives and similar repositories for ReinforcementLearning-R.S.
Users who are interested in ReinforcementLearning-R.S. are comparing it to the repositories listed below.
- Build Jekyll site with GitBook style! ☆14 Updated 2 months ago
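- A small 0.2B Chinese chat model (ChatLM-Chinese-0.2B), with all code open-sourced for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks and includes a fine-tuning example for triple information extraction. ☆1,573 Updated last year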
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking ☆1,470 Updated this week
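- Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG). ☆563 Updated last year
- A demo of deploying Tongyi Qianwen (Qwen) inference with VLLM ☆595 Updated last year
- A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization; the model has 1B parameters and supports Chinese and English. ☆546 Updated 5 months ago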
- [ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models. ☆1,840 Updated 7 months ago
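- This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into instruction-tuning format, and fine-tune LLMs on it, thereby improving LLMs' understanding of tabular data and ultimately building a large language model dedicated to table intelligence tasks. ☆609 Updated last year
- Collects the best currently open-source table recognition models, improves pre-processing and post-processing, and converts the models to ONNX. ☆785 Updated last week
- A repository for pretraining and SFT of a small-parameter Chinese LLaMa2 from scratch; a single 24G GPU is enough to obtain a chat-llama2 capable of simple Chinese question answering. ☆2,838 Updated last year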
- Netease Youdao's open-source embedding and reranker models for RAG products. ☆1,818 Updated last month
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, … ☆5,810 Updated last week
- A repository for individuals to experiment with and reproduce the pre-training process of LLMs. ☆460 Updated 3 months ago
- Building a MiniLLM from 0 to 1 (pretrain + sft + dpo, work in progress) ☆462 Updated 4 months ago
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆6,832 Updated this week
- Train a 1B LLM on 1T tokens from scratch as an individual ☆707 Updated 3 months ago
- Run through the chatgpt technical roadmap from scratch. ☆250 Updated 11 months ago
- ☆1,006 Updated last week
- unified embedding model ☆866 Updated last year
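- A collection of Chinese-language learning and practice materials for ChatGPT: fine-tuning large models such as LLaMA and ChatGLM ☆14 Updated 2 years ago
- Chinese NLP solutions (large models, data, models, training, inference) ☆3,595 Updated last week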
- ☆1,842 Updated 8 months ago
- XuanYuan (轩辕): Du Xiaoman's Chinese large language model for financial dialogue ☆1,247 Updated 7 months ago
- An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to infer… ☆743 Updated 4 months ago
- A simple, fast tool for word segmentation and named entity recognition ☆606 Updated 2 weeks ago
- datasets resource ☆117 Updated last month
- Reproduce R1 Zero on Logic Puzzle ☆2,384 Updated 4 months ago
- ☆953 Updated 6 months ago
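- FinGLM: dedicated to building an open, public-interest, long-lasting financial large-model project, using open source and openness to advance "AI + finance". ☆2,041 Updated last year
- A curated collection of open-source SFT datasets, updated on an ongoing basis ☆533 Updated 2 years ago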