jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
☆208 · Updated 6 months ago
Related projects
Alternatives and complementary repositories for Vicuna-LoRA-RLHF-PyTorch
- [NIPS2023] RRHF & Wombat ☆798 · Updated last year
- ☆453 · Updated 5 months ago
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism ☆207 · Updated last year
- ☆120 · Updated 11 months ago
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆201 · Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆126 · Updated last year
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆106 · Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆185 · Updated last year
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu… ☆344 · Updated last year
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆782 · Updated 4 months ago
- llama fine-tuning with lora ☆137 · Updated 6 months ago
- Naive Bayes-based Context Extension ☆313 · Updated last year
- Code for "Lion: Adversarial Distillation of Proprietary Large Language Models (EMNLP 2023)" ☆201 · Updated 9 months ago
- Multi-language Enhanced LLaMA ☆301 · Updated last year
- ☆273 · Updated 6 months ago
- deep learning ☆149 · Updated 4 months ago
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆182 · Updated last year
- Open efforts to implement ChatGPT-like models and beyond. ☆105 · Updated 3 months ago
- FireAct: Toward Language Agent Fine-tuning ☆255 · Updated last year
- A large-scale, fine-grained, diverse preference dataset (and models). ☆315 · Updated 10 months ago
- The open source implementation of ChatGPT, Alpaca, Vicuna and RLHF Pipeline. Implementing a ChatGPT from scratch. ☆175 · Updated 5 months ago
- LOMO: LOw-Memory Optimization ☆979 · Updated 4 months ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆175 · Updated last year
- All available datasets for Instruction Tuning of Large Language Models ☆237 · Updated 11 months ago
- ☆708 · Updated 5 months ago
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools ☆136 · Updated last year
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024) ☆332 · Updated 2 months ago
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks. ☆528 · Updated 8 months ago
- Implementation of Chinese ChatGPT ☆287 · Updated last year
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability. ☆296 · Updated last year