schinger / FullLLM
Full stack LLM (Pre-training/finetuning, PPO(RLHF), Inference, Quant, etc.)
☆30Updated 10 months ago
Alternatives and similar repositories for FullLLM
Users who are interested in FullLLM are comparing it to the libraries listed below.
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat☆116Updated 2 years ago
- Reinforcement Learning in LLM and NLP.☆62Updated this week
- Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models☆66Updated 10 months ago
- ☆147Updated last year
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆94Updated last month
- a-m-team's exploration in large language modeling☆195Updated 7 months ago
- The related works and background techniques about OpenAI o1☆221Updated 11 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆275Updated 10 months ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆283Updated 2 years ago
- Fantastic Data Engineering for Large Language Models☆93Updated last year
- Train your grpo with zero dataset and low resources, 8bit/4bit/lora/qlora supported, multi-gpu supported ...☆79Updated 8 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆148Updated 2 months ago
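- Fine-tuning large language models with the DPO algorithm; simple and easy to get started.☆48Updated last year
- How to train an LLM tokenizer☆154Updated 2 years ago
- Custom reward development on verl☆140Updated 7 months ago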
- an implementation of transformer, bert, gpt, and diffusion models for learning purposes☆159Updated last year
- ☆36Updated last year
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l…☆285Updated 2 years ago
- PyTorch distributed training☆73Updated 2 years ago
- ☆147Updated last year
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …☆36Updated 7 months ago
- Collection of papers for scalable automated alignment.☆94Updated last year
- ☆47Updated 10 months ago
- ☆115Updated last year
- ☆130Updated last year
- A repository sharing the literature on long-context large language models, including the methodologies and the evaluation benchmarks☆270Updated last year
- ☆82Updated last month
- OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards☆113Updated last week
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆412Updated 6 months ago
- ☆87Updated 2 years ago