l294265421 / alpaca-rlhf
Finetuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat
☆106 Updated last year
Related projects:
- ☆109 Updated 5 months ago
- ☆82 Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism ☆207 Updated 9 months ago
- MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING ☆86 Updated 5 months ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆128 Updated 2 months ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning ☆196 Updated last year
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo… ☆281 Updated last week
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆37 Updated 6 months ago
- ☆156 Updated last year
- Code for "Lion: Adversarial Distillation of Proprietary Large Language Models (EMNLP 2023)" ☆195 Updated 7 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆208 Updated last week
- Naive Bayes-based Context Extension ☆310 Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation ☆61 Updated 4 months ago
- How to train an LLM tokenizer ☆123 Updated last year
- Train llm (bloom, llama, baichuan2-7b, chatglm3-6b) with deepspeed pipeline mode. Faster than zero/zero++/fsdp. ☆90 Updated 7 months ago
- llama2 finetuning with deepspeed and lora ☆159 Updated last year
- ☆148 Updated 10 months ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆67 Updated last year
- ☆265 Updated 4 months ago
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆123 Updated last year
- ☆89 Updated 9 months ago
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆177 Updated last year
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models" ☆105 Updated 3 months ago
- Chinese large language model evaluation, phase 2 ☆68 Updated 10 months ago
- A Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark ☆96 Updated last year
- Simple, efficient multi-GPU fine-tuning of large models with deepspeed + trainer ☆115 Updated last year
- Chinese large language model evaluation, phase 1 ☆105 Updated 10 months ago
- ☆124 Updated 2 months ago
- Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718 ☆244 Updated last week
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆133 Updated 3 months ago