jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
☆218 Updated last year
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch
Users who are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the repositories listed below.
- [NIPS2023] RRHF & Wombat ☆811 Updated 2 years ago
- llama fine-tuning with lora ☆139 Updated last year
- ☆124 Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆138 Updated 2 years ago
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability. ☆299 Updated 2 years ago
- Large Language Models Are Reasoning Teachers (ACL 2023) ☆341 Updated 7 months ago
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models ☆210 Updated last year
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆177 Updated 2 years ago
- ☆327 Updated last year
- Naive Bayes-based Context Extension ☆325 Updated 10 months ago
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆203 Updated 2 years ago
- ☆460 Updated last year
- Official repository for LongChat and LongEval ☆532 Updated last year
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human… ☆60 Updated 2 years ago
- A Multi-Turn Dialogue Corpus based on Alpaca Instructions ☆173 Updated 2 years ago
- Multi-language Enhanced LLaMA ☆303 Updated 2 years ago
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆209 Updated last year
- The open source implementation of DeepSeek-R1. ☆273 Updated 7 months ago
- llama2 finetuning with deepspeed and lora ☆176 Updated 2 years ago
- Datasets for Instruction Tuning of Large Language Models ☆257 Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism ☆224 Updated last year
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs… ☆184 Updated 2 years ago
- LongQLoRA: Extend Context Length of LLMs Efficiently ☆166 Updated last year
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆139 Updated 5 months ago
- ☆763 Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆229 Updated last month
- ☆922 Updated last year
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools ☆142 Updated 2 years ago
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆115 Updated 2 years ago
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024] ☆376 Updated last year