jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
☆212 Updated 10 months ago
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch:
Users who are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below.
- [NIPS2023] RRHF & Wombat ☆804 Updated last year
- ☆124 Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆134 Updated last year
- Naive Bayes-based Context Extension ☆321 Updated 3 months ago
- [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition ☆619 Updated 8 months ago
- llama fine-tuning with lora ☆140 Updated 10 months ago
- ☆459 Updated 9 months ago
- Official repository for LongChat and LongEval ☆516 Updated 10 months ago
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆114 Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆206 Updated last year
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆137 Updated 9 months ago
- An open-source chatbot built with ExpertPrompting that achieves 96% of ChatGPT's capability. ☆300 Updated last year
- All available datasets for Instruction Tuning of Large Language Models ☆247 Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism ☆216 Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆195 Updated last year
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆393 Updated 10 months ago
- Code for fine-tuning Platypus fam LLMs using LoRA ☆628 Updated last year
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models ☆204 Updated last year
- Multi-language Enhanced LLaMA ☆301 Updated last year
- ☆311 Updated 9 months ago
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024) ☆364 Updated 7 months ago
- ☆268 Updated last year
- [ACL 2024] An Easy-to-use Instruction Processing Framework for LLMs. ☆398 Updated 3 months ago
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs… ☆184 Updated last year
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆801 Updated 8 months ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 Updated last year
- Large Language Models Are Reasoning Teachers (ACL 2023) ☆327 Updated 2 weeks ago
- A Multi-Turn Dialogue Corpus based on Alpaca Instructions ☆169 Updated last year
- LOMO: LOw-Memory Optimization ☆981 Updated 8 months ago
- ☆355 Updated 2 years ago