jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT but with Vicuna
☆219Updated last year
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch
Users that are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
- [NIPS2023] RRHF & Wombat☆811Updated last year
- ☆124Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma…☆138Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆177Updated last year
- llama fine-tuning with lora☆138Updated last year
- ☆460Updated last year
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability.☆301Updated 2 years ago
- Large Language Models Are Reasoning Teachers (ACL 2023)☆341Updated 5 months ago
- Naive Bayes-based Context Extension☆326Updated 8 months ago
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human…☆60Updated 2 years ago
- Multi-language Enhanced LLaMA☆302Updated 2 years ago
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA☆226Updated last week
- Scripts for fine-tuning Llama2 via SFT and DPO.☆203Updated 2 years ago
- ☆324Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism☆224Updated last year
- Official repository for LongChat and LongEval☆528Updated last year
- deep learning☆149Updated 3 months ago
- A self-alignment method for role-play. Benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters…☆203Updated last year
- Crosslingual Generalization through Multitask Finetuning☆535Updated 11 months ago
- Datasets for Instruction Tuning of Large Language Models☆255Updated last year
- ☆281Updated last year
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models☆210Updated last year
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat☆114Updated 2 years ago
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools☆142Updated 2 years ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning☆398Updated last year
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu…☆353Updated 2 years ago
- Large language model fine-tuning for BLOOM, OPT, GPT, GPT-2, LLaMA, LLaMA-2, CPM-Ant and so on☆98Updated last year
- Code for fine-tuning Platypus fam LLMs using LoRA☆628Updated last year
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models"☆209Updated last year
- Evaluating LLMs' multi-round chatting capability via assessing conversations generated by two LLM instances.☆156Updated 3 months ago