jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
☆220Updated last year
Alternatives and similar repositories for Vicuna-LoRA-RLHF-PyTorch
Users who are interested in Vicuna-LoRA-RLHF-PyTorch are comparing it to the libraries listed below.
- [NIPS2023] RRHF & Wombat☆809Updated 2 years ago
- llama fine-tuning with lora☆140Updated last year
- ☆123Updated 2 years ago
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma…☆140Updated 2 years ago
- An open-source chatbot built with ExpertPrompting which achieves 96% of ChatGPT's capability.☆299Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆178Updated 2 years ago
- ☆333Updated last year
- ☆459Updated last year
- An open-source reproduction of DeepSeek-R1.☆273Updated 9 months ago
- Official repository for LongChat and LongEval☆533Updated last year
- Large Language Models Are Reasoning Teachers (ACL 2023)☆343Updated 9 months ago
- ☆770Updated last year
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human…☆60Updated 2 years ago
- Open efforts to implement ChatGPT-like models and beyond.☆107Updated last year
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.☆840Updated last year
- Naive Bayes-based Context Extension☆326Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO.☆206Updated 2 years ago
- ☆282Updated last year
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs…☆184Updated 2 years ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning☆409Updated last year
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models☆212Updated last year
- Crosslingual Generalization through Multitask Finetuning☆537Updated last year
- A Multi-Turn Dialogue Corpus based on Alpaca Instructions☆177Updated 2 years ago
- ☆922Updated last year
- Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools☆144Updated 2 years ago
- Multi-language Enhanced LLaMA☆303Updated 2 years ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.☆138Updated 7 months ago
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu…☆354Updated 2 years ago
- llama2 finetuning with deepspeed and lora☆175Updated 2 years ago
- Datasets for Instruction Tuning of Large Language Models☆260Updated 2 years ago