hscspring / rl-llm-nlp
Reinforcement Learning in LLM and NLP.
☆39Updated last week
Alternatives and similar repositories for rl-llm-nlp
Users who are interested in rl-llm-nlp compare it to the libraries listed below.
- ☆124Updated last year
- ☆141Updated last year
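- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles.☆91Updated 9 months ago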
- a-m-team's exploration in large language modeling☆161Updated 3 weeks ago
- ☆142Updated 11 months ago
- Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥☆262Updated 5 months ago
- Related works and background techniques for OpenAI o1☆223Updated 5 months ago
- ☆241Updated 2 weeks ago
- A curated list of awesome works in Routing LLMs paradigm (👉 Welcome to submit your contributions to this code repository)☆40Updated last month
- Train an LLM from scratch on a single 24 GB GPU☆56Updated last month
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"☆82Updated 10 months ago
- ☆109Updated 7 months ago
- Fantastic Data Engineering for Large Language Models☆89Updated 6 months ago
- This is the reading list for the survey "A Survey on the Optimization of LLM-based Agents". We will keep adding papers and improving the…☆115Updated last month
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …☆33Updated 3 weeks ago
- Fine-tune large language models with the DPO algorithm; simple and easy to get started.☆39Updated 11 months ago
- How to train an LLM tokenizer☆150Updated last year
- ☆82Updated last year
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆216Updated 4 months ago
- ☆97Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆81Updated 7 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆160Updated this week
- ☆82Updated last year
- ☆170Updated last year
- ☆152Updated last month
- Custom reward development on top of verl☆56Updated last month
- An Awesome List of Reinforcement Learning-based Large Language Agent Works. Collected directly from official code bases.☆154Updated this week
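- Large language model applications: RAG, NL2SQL, chatbots, pre-training, MoE (mixture-of-experts) models, fine-tuning, reinforcement learning, and Tianchi data competitions☆62Updated 4 months ago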
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…☆127Updated this week
- A Comprehensive Survey on Long Context Language Modeling☆152Updated 3 weeks ago