jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
☆211 · Updated 9 months ago
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch:
Users who are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below:
- [NIPS2023] RRHF & Wombat ☆799 · Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆200 · Updated last year
- ☆456 · Updated 8 months ago
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆134 · Updated last year
- ☆122 · Updated last year
- llama fine-tuning with lora ☆140 · Updated 9 months ago
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆112 · Updated last year
- All available datasets for Instruction Tuning of Large Language Models ☆242 · Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆194 · Updated last year
- ☆728 · Updated 8 months ago
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human… ☆58 · Updated last year
- Official repository for LongChat and LongEval ☆519 · Updated 8 months ago
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism ☆215 · Updated last year
- Naive Bayes-based Context Extension ☆320 · Updated 2 months ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆177 · Updated last year
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs… ☆185 · Updated last year
- Multi-language Enhanced LLaMA ☆301 · Updated last year
- llama2 finetuning with deepspeed and lora ☆172 · Updated last year
- ☆268 · Updated last year
- Crosslingual Generalization through Multitask Finetuning ☆525 · Updated 4 months ago
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models ☆203 · Updated last year
- deep learning ☆150 · Updated 7 months ago
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning ☆389 · Updated 9 months ago
- The open-source implementation (reproduction) of DeepSeek-R1 ☆229 · Updated this week
- Large language model fine-tuning for BLOOM, OPT, GPT, GPT-2, LLaMA, LLaMA-2, CPM-Ant, and so on ☆96 · Updated 9 months ago
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆205 · Updated last year
- Large Language Models Are Reasoning Teachers (ACL 2023) ☆320 · Updated last year
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation" ☆226 · Updated last year
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools ☆136 · Updated last year
- ☆456 · Updated last year