jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT but with Vicuna.
☆213 Updated 11 months ago
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch:
Users that are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
- [NIPS2023] RRHF & Wombat ☆806 Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆135 Updated last year
- Official repository for LongChat and LongEval ☆519 Updated 11 months ago
- llama fine-tuning with lora ☆139 Updated 11 months ago
- All available datasets for Instruction Tuning of Large Language Models ☆248 Updated last year
- ☆459 Updated last year
- ☆124 Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆212 Updated last year
- llama2 finetuning with deepspeed and lora ☆174 Updated last year
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆115 Updated last year
- Multi-language Enhanced LLaMA ☆301 Updated 2 years ago
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability. ☆300 Updated last year
- ☆459 Updated 10 months ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆177 Updated last year
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models ☆205 Updated last year
- ☆913 Updated 11 months ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆395 Updated 11 months ago
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks. ☆545 Updated last year
- Code for fine-tuning Platypus fam LLMs using LoRA ☆629 Updated last year
- [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition ☆628 Updated 9 months ago
- LOMO: LOw-Memory Optimization ☆985 Updated 9 months ago
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools ☆138 Updated 2 years ago
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation" ☆226 Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆197 Updated last year
- deep learning ☆149 Updated last month
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆173 Updated 2 years ago
- ☆270 Updated 2 years ago
- Large Language Models Are Reasoning Teachers (ACL 2023) ☆332 Updated last month
- Large language model finetuning: bloom, opt, gpt, gpt2, llama, llama-2, cpmant and so on ☆97 Updated last year
- A Multi-Turn Dialogue Corpus based on Alpaca Instructions ☆169 Updated last year