jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT but with Vicuna
☆219 · Updated last year
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch
Users that are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
- [NIPS2023] RRHF & Wombat ☆811 · Updated 2 years ago
- ☆123 · Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆139 · Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 · Updated 2 years ago
- llama fine-tuning with lora ☆140 · Updated last year
- Multi-language Enhanced LLaMA ☆303 · Updated 2 years ago
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability. ☆299 · Updated 2 years ago
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models ☆212 · Updated last year
- ☆459 · Updated last year
- Large Language Models Are Reasoning Teachers (ACL 2023) ☆341 · Updated 7 months ago
- Official repository for LongChat and LongEval ☆531 · Updated last year
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆172 · Updated 2 years ago
- Crosslingual Generalization through Multitask Finetuning ☆537 · Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆204 · Updated 2 years ago
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆115 · Updated 2 years ago
- Naive Bayes-based Context Extension ☆324 · Updated 10 months ago
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆231 · Updated 2 months ago
- llama2 finetuning with deepspeed and lora ☆176 · Updated 2 years ago
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human… ☆60 · Updated 2 years ago
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu… ☆351 · Updated 2 years ago
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs… ☆184 · Updated 2 years ago
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆209 · Updated 2 years ago
- Datasets for Instruction Tuning of Large Language Models ☆257 · Updated last year
- Large language model fine-tuning: bloom, opt, gpt, gpt2, llama, llama-2, cpmant and so on ☆98 · Updated last year
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools ☆142 · Updated 2 years ago
- ☆330 · Updated last year
- ☆767 · Updated last year
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆826 · Updated last year
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆401 · Updated last year
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ☆967 · Updated last year