erxiong0 / ReinforcementLearning-R.S.

To reproduce the experiments in Sutton's book

☆13

Alternatives and similar repositories for ReinforcementLearning-R.S.

Users who are interested in ReinforcementLearning-R.S. are comparing it to the repositories listed below.

erxiong0 / chichi-gitbook
Build Jekyll site with GitBook style!
☆13 Updated last month
erxiong0 / llm_app
Make an LLM your private assistant
☆15 Updated 2 months ago
HIT-SCIR / Chinese-Mixtral-8x7B
Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B)
☆650 Updated 10 months ago
RapidAI / TableStructureRec
Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX.
☆732 Updated 2 months ago
ScienceOne-AI / DeepSeek-671B-SFT-Guide
An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to infer…
☆703 Updated 3 months ago
gomate-community / TrustRAG
TrustRAG: The RAG Framework within Reliable input, Trusted output
☆996 Updated 3 weeks ago
opendatalab / magic-html
☆471 Updated 3 months ago
charent / Phi2-mini-Chinese
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).
☆552 Updated 11 months ago
multimodal-art-projection / MAP-NEO
☆942 Updated 4 months ago
SkyworkAI / Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sour…
☆1,389 Updated 3 months ago
360AILAB-NLP / 360LayoutAnalysis
360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute
☆290 Updated 9 months ago
FlagOpen / FlagData
☆339 Updated last year
liwenju0 / cutword
A simple, fast tool for word segmentation and named entity recognition
☆599 Updated 2 months ago
AI-Study-Han / Zero-Chatgpt
Run through the ChatGPT technical pipeline from scratch.
☆241 Updated 9 months ago
Qihoo360 / Light-R1
☆717 Updated 3 weeks ago
IEIT-Yuan / Yuan-2.0
Yuan 2.0 Large Language Model
☆684 Updated 11 months ago
charent / ChatLM-mini-Chinese
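A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B), with all code open-sourced for the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks and includes a fine-tuning example for triple information extraction.
☆1,554 Updated last year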
FudanDISC / DISC-FinLLM
DISC-FinLLM, a Chinese financial large language model (LLM) designed to provide users with professional, intelligent, and comprehensive financial consulting services in financial scenarios.
☆731 Updated last year
opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆1,367 Updated 2 months ago
yongzhuo / LLM-SFT
Chinese LLM fine-tuning (LLM-SFT), the MWP-Instruct math instruction dataset, supported models (ChatGLM-6B, LLaMA, Bloom-7B, baichuan-7B), supports (LoRA, QLoRA, DeepSpeed, UI, TensorboardX), supports (…
☆203 Updated last year
RapidAI / RapidLayout
Analysis of Chinese and English layouts
☆218 Updated last week
OpenBMB / MiniCPM-CookBook
This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…
☆246 Updated this week
wangyuxinwhy / uniem
unified embedding model
☆864 Updated last year
jiahe7ay / MINI_LLM
This is a repository used by individuals to experiment and reproduce the pre-training process of LLM.
☆441 Updated last month
Jing-yilin / E2M
E2M API, converting everything to markdown (LLM-friendly Format).
☆136 Updated 6 months ago
NovaSearch-Team / RAG-Retrieval
Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, ReRanker.
☆941 Updated 2 weeks ago
SpursGoZmy / Tabular-LLM
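This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into instruction fine-tuning format, and fine-tune LLMs to strengthen their understanding of tabular data, ultimately building a large language model specialized for table intelligence tasks.
☆596 Updated last year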
KylinMountain / graphrag-server
Add a 🚀 streaming web server to GraphRAG, compatible with the OpenAI SDK, with support for accessible entity links 🔗, suggested questions, and local embedding models, plus many bug fixes.
☆254 Updated 3 months ago
OpenBMB / VisRAG
Parsing-free RAG supported by VLMs
☆746 Updated 4 months ago
RapidAI / RapidTable
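An inference library for sequence-based table recognition algorithms, integrating table recognition algorithms from PP-Structure, modelscope, and others.
☆314 Updated this week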