jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
☆213Updated 11 months ago
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch
Users that are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
- ☆124Updated last year
- llama fine-tuning with lora☆139Updated last year
- [NIPS2023] RRHF & Wombat☆807Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO.☆200Updated last year
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat☆115Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma…☆135Updated 2 years ago
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human…☆58Updated 2 years ago
- Multi-language Enhanced LLaMA☆301Updated 2 years ago
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA☆214Updated last year
- Official repository for LongChat and LongEval☆519Updated 11 months ago
- llama2 finetuning with deepspeed and lora☆174Updated last year
- An open-source chatbot built with ExpertPrompting which achieves 96% of ChatGPT's capability.☆300Updated last year
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆177Updated last year
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs…☆185Updated last year
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models☆206Updated last year
- All available datasets for Instruction Tuning of Large Language Models☆250Updated last year
- deep learning☆149Updated last week
- Crosslingual Generalization through Multitask Finetuning☆532Updated 7 months ago
- LOMO: LOw-Memory Optimization☆986Updated 10 months ago
- Large language model fine-tuning for BLOOM, OPT, GPT, GPT-2, LLaMA, LLaMA-2, CPM-Ant, and so on☆97Updated last year
- Naive Bayes-based Context Extension☆326Updated 5 months ago
- A self-alignment method for role-play. Benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters…☆193Updated 11 months ago
- Large Language Models Are Reasoning Teachers (ACL 2023)☆335Updated 2 months ago
- ☆459Updated 11 months ago
- Open Source WizardCoder Dataset☆158Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism☆219Updated last year
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models"☆206Updated last year
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools☆139Updated 2 years ago
- [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition☆633Updated 9 months ago
- Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF☆192Updated last year