jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
☆216 · Updated last year
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch
Users who are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below.
- llama fine-tuning with lora ☆139 · Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆218 · Updated 2 years ago
- [NIPS2023] RRHF & Wombat ☆808 · Updated last year
- Official repository for LongChat and LongEval ☆521 · Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆201 · Updated last year
- All available datasets for Instruction Tuning of Large Language Models ☆252 · Updated last year
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆171 · Updated 2 years ago
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆114 · Updated 2 years ago
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human… ☆58 · Updated 2 years ago
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆136 · Updated 2 years ago
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs… ☆185 · Updated 2 years ago
- llama2 finetuning with deepspeed and lora ☆175 · Updated last year
- ☆96 · Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 · Updated last year
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆140 · Updated last month
- ☆124 · Updated last year
- ☆459 · Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism ☆221 · Updated last year
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks. ☆546 · Updated last year
- Crosslingual Generalization through Multitask Finetuning ☆536 · Updated 9 months ago
- Multi-language Enhanced LLaMA ☆301 · Updated 2 years ago
- Open Source WizardCoder Dataset ☆158 · Updated last year
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability. ☆300 · Updated 2 years ago
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu… ☆348 · Updated 2 years ago
- Simple next-token-prediction for RLHF ☆227 · Updated last year
- minichatgpt - To Train ChatGPT In 5 Minutes ☆168 · Updated last year
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation" ☆227 · Updated 2 years ago
- Naive Bayes-based Context Extension ☆326 · Updated 6 months ago
- A Multi-Turn Dialogue Corpus based on Alpaca Instructions ☆171 · Updated 2 years ago
- ☆270 · Updated 2 years ago