jackaduma / Alpaca-LoRA-RLHF-PyTorch
A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.
☆55 · Updated last year
Related projects:
- ☆94 · Updated last year
- Unofficial implementation of AlpaGasus ☆83 · Updated 11 months ago
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆79 · Updated last week
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆128 · Updated 2 months ago
- ☆131 · Updated last year
- Code for ACL2023 paper: Pre-Training to Learn in Context ☆106 · Updated last month
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023) ☆62 · Updated 9 months ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆151 · Updated 4 months ago
- Plug in and play implementation of "Textbooks Are All You Need", ready for training, inference, and dataset generation ☆75 · Updated last year
- Self-Alignment with Principle-Following Reward Models ☆144 · Updated 6 months ago
- ☆105 · Updated this week
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆133 · Updated 10 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆73 · Updated 7 months ago
- ☆52 · Updated 7 months ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities. ☆134 · Updated 6 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…