jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT but with Vicuna.
☆219 Updated last year
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch
Users who are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
- llama fine-tuning with lora ☆138 Updated last year
- [NIPS2023] RRHF & Wombat ☆812 Updated last year
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆138 Updated 2 years ago
- ☆124 Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆203 Updated 2 years ago
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models ☆210 Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆228 Updated last month
- Naive Bayes-based Context Extension ☆326 Updated 9 months ago
- Large language model finetuning: bloom, opt, gpt, gpt2, llama, llama-2, cpmant and so on ☆98 Updated last year
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆177 Updated 2 years ago
- Multi-language Enhanced LLaMA ☆302 Updated 2 years ago
- The open source implementation of DeepSeek-R1. An open-source reproduction of DeepSeek-R1 ☆270 Updated 6 months ago
- deep learning ☆149 Updated 4 months ago
- ☆325 Updated last year
- ☆460 Updated last year
- llama2 finetuning with deepspeed and lora ☆176 Updated 2 years ago
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human… ☆60 Updated 2 years ago
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability. ☆300 Updated 2 years ago
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆114 Updated 2 years ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆139 Updated 4 months ago
- Large Language Models Are Reasoning Teachers (ACL 2023) ☆341 Updated 6 months ago
- ☆281 Updated last year
- A Multi-Turn Dialogue Corpus based on Alpaca Instructions ☆173 Updated 2 years ago
- Open efforts to implement ChatGPT-like models and beyond. ☆109 Updated last year
- Crosslingual Generalization through Multitask Finetuning ☆535 Updated 11 months ago
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs… ☆184 Updated 2 years ago
- Datasets for Instruction Tuning of Large Language Models ☆255 Updated last year
- Evaluating LLMs' multi-round chatting capability via assessing conversations generated by two LLM instances. ☆156 Updated 3 months ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆399 Updated last year
- A self-alignment method for role-play. Benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters… ☆202 Updated last year