erxiong0 / ReinforcementLearning-R.S.
To reproduce the experiments in Sutton's book
☆14Updated 7 months ago
Alternatives and similar repositories for ReinforcementLearning-R.S.
Users who are interested in ReinforcementLearning-R.S. are comparing it to the libraries listed below
- Build Jekyll site with GitBook style!☆14Updated 5 months ago
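 - A small 0.2B Chinese chat model (ChatLM-Chinese-0.2B), with all code open-sourced for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks and includes a fine-tuning example for triple information extraction.☆1,624Updated last year
 - Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX.☆864Updated 3 months ago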
 - The Open-Source Data Annotation Platform☆946Updated 8 months ago
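 - FinGLM: an effort to build an open, public-interest, long-lasting financial large-model project that uses open source and openness to advance "AI + finance".☆2,096Updated last year
 - This project collects open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), converts the raw data into instruction fine-tuning format, and fine-tunes LLMs on it to strengthen their understanding of tabular data, with the final goal of building a large language model dedicated to table intelligence tasks.☆623Updated last year
 - A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; a single 24G GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.☆2,863Updated last year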
 - ☆543Updated last year
 - Netease Youdao's open-source embedding and reranker models for RAG products.☆1,841Updated last month
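 - A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization; the model has 1B parameters and supports Chinese and English.☆650Updated 8 months ago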
 - A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…☆1,789Updated 6 months ago
 - A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.☆1,853Updated this week
 - ☆1,111Updated last month
 - ☆1,850Updated 11 months ago
 - Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).☆575Updated last year
 - Walk through the chatgpt technical pipeline from scratch.☆264Updated last year
 - Train a 1B LLM with 1T tokens from scratch as an individual☆741Updated 6 months ago
 - [ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.☆1,874Updated 10 months ago
 - ☆964Updated 8 months ago
 - Tongyi Qianwen VLLM inference deployment demo☆617Updated last year
 - LLM&VLM Tutorial☆1,896Updated 5 months ago
 - [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation☆1,111Updated last week
 - Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆7,976Updated 8 months ago
 - Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, ReRanker.☆1,048Updated 3 months ago
 - Tuning LLMs with no tears💦; Sample Design Engineering (SDE) for more efficient downstream-tuning.☆1,016Updated last year
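 - A curated collection of open-source SFT datasets, updated on an ongoing basis☆552Updated 2 years ago
 - Fine-tuning of the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models for specific downstream tasks, covering Freeze, Lora, P-tuning, full-parameter fine-tuning, and more☆2,774Updated last year
 - Build a MiniLLM from 0 to 1 (pretrain + sft + dpo, practice in progress)☆498Updated 7 months ago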
 - Reproduce R1 Zero on Logic Puzzle☆2,408Updated 7 months ago
 - Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese chat model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets☆3,056Updated last year