erxiong0 / ReinforcementLearning-R.S.
To reproduce the experiments in Sutton's book
☆13Updated 2 months ago
Alternatives and similar repositories for ReinforcementLearning-R.S.
Users who are interested in ReinforcementLearning-R.S. are comparing it to the libraries listed below.
- Build Jekyll site with GitBook style!☆13Updated last week
- Use an LLM as a private assistant☆15Updated last month
- ☆939Updated 4 months ago
- DISC-FinLLM, a Chinese financial large language model (LLM) designed to provide users with professional, intelligent, and comprehensive financial consulting services in financial scenarios.☆728Updated last year
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding☆2,184Updated last week
- Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, ReRanker.☆918Updated this week
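- FinGLM: a project dedicated to building an open, non-profit, and long-lasting financial large language model, using open source and openness to promote "AI + finance".☆1,991Updated last year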
- Distributed RL System for LLM Reasoning☆1,544Updated this week
- A pre-built agent for TableGPT2.☆581Updated 2 months ago
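- This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into instruction-tuning format, and fine-tune LLMs on it, thereby improving LLMs' understanding of tabular data and ultimately building a large language model dedicated to table intelligence tasks.☆587Updated last year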
- Reproduce R1 Zero on Logic Puzzle☆2,350Updated 2 months ago
- Enterprise-grade RAG systems, from beginner to expert☆478Updated 2 months ago
- Easy-to-Use RAG Framework; CCF AIOps International Challenge 2024 Top3 Solution (third place)☆490Updated 6 months ago
- Tongyi Qianwen (Qwen) vLLM inference deployment demo☆580Updated last year
- Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX☆709Updated 2 months ago
- Building a MiniLLM from 0 to 1 (pretrain + SFT + DPO, work in progress)☆439Updated 2 months ago
- [ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.☆1,834Updated 5 months ago
- Reproductions of large-model algorithms, plus some study notes☆1,575Updated 2 weeks ago
- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆282Updated 8 months ago
- An attempt to reproduce S1☆13Updated 2 months ago
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.☆3,350Updated last week
- A very simple GRPO implementation for reproducing r1-like LLM thinking.☆1,096Updated 2 months ago
- The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.☆1,109Updated last week
- Implementing a small-parameter Chinese large language model from scratch.☆665Updated 9 months ago
- My learning notes/codes for ML SYS.☆2,337Updated last week
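- FinQwen: a project dedicated to building an open, stable, high-quality financial large language model; it builds an intelligent question-answering system for financial scenarios on top of large models and uses open source and openness to promote "AI + finance".☆384Updated 11 months ago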
- Netease Youdao's open-source embedding and reranker models for RAG products.☆1,767Updated 4 months ago
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models☆1,781Updated 4 months ago
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL☆2,570Updated this week
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking☆1,076Updated this week