schinger / FullLLM
Full stack LLM (Pre-training/finetuning, PPO(RLHF), Inference, Quant, etc.)
☆30Updated 11 months ago
Alternatives and similar repositories for FullLLM
Users that are interested in FullLLM are comparing it to the libraries listed below
- Finetuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat☆117Updated 2 years ago
- Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models☆67Updated 11 months ago
- Reinforcement Learning in LLM and NLP.☆62Updated last month
- Related work and background techniques for OpenAI o1☆221Updated last year
- Train your GRPO with zero dataset and low resources, 8bit/4bit/LoRA/QLoRA supported, multi-GPU supported ...☆79Updated 9 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆153Updated 3 months ago
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems☆94Updated 2 months ago
- a-m-team's exploration in large language modeling☆195Updated 8 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆283Updated 11 months ago
- Custom reward development on top of verl☆144Updated 8 months ago
- Fine-tuning large language models with the DPO algorithm; simple and easy to get started.☆50Updated last year
- Fantastic Data Engineering for Large Language Models☆93Updated last year
- ☆147Updated last year
- ☆115Updated last year
- Collection of papers for scalable automated alignment.☆93Updated last year
- An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes☆160Updated last year
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain☆266Updated 2 years ago
- ☆41Updated 11 months ago
- Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework☆205Updated 3 weeks ago
- ☆130Updated last year
- ☆147Updated last year
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆284Updated 2 years ago
- Heuristic filtering framework for RefineCode☆82Updated 10 months ago
- How to train an LLM tokenizer☆153Updated 2 years ago
- Training an LLM from scratch on a single 24 GB GPU☆56Updated 6 months ago
- PyTorch distributed training☆73Updated 2 years ago
- A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks☆272Updated last year
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l…☆284Updated 2 years ago
- ☆163Updated last year
- ☆87Updated 2 years ago