jackaduma / Alpaca-LoRA-RLHF-PyTorch
A full pipeline to finetune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.
☆60 · Updated 2 years ago
Alternatives and similar repositories for Alpaca-LoRA-RLHF-PyTorch
Users who are interested in Alpaca-LoRA-RLHF-PyTorch are comparing it to the repositories listed below.
- Code for ACL2023 paper: Pre-Training to Learn in Context ☆106 · Updated last year
- On Transferability of Prompt Tuning for Natural Language Processing ☆100 · Updated last year
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆78 · Updated last year
- Unofficial implementation of AlpaGasus ☆93 · Updated 2 years ago
- ☆142 · Updated 2 years ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆89 · Updated last year
- ☆173 · Updated 2 years ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆165 · Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆251 · Updated 2 years ago
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023) ☆64 · Updated 2 years ago
- ☆98 · Updated 2 years ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models" ☆132 · Updated 2 years ago
- ⚡Research papers about leveraging the capabilities of language models⚡ ☆52 · Updated 2 years ago
- [ICLR 2023] Codebase for Copy-Generator model, including an implementation of kNN-LM ☆189 · Updated 10 months ago
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆156 · Updated 2 years ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆220 · Updated 2 years ago
- This repository is the official implementation of our paper MVP: Multi-task Supervised Pre-training for Natural Language Generation. ☆73 · Updated 3 years ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆138 · Updated 7 months ago
- ☆35 · Updated 2 years ago
- [NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Lea… ☆76 · Updated last year
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models ☆110 · Updated 3 years ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆100 · Updated 2 years ago
- Self-Alignment with Principle-Following Reward Models ☆169 · Updated 2 months ago
- ☆75 · Updated last year
- Source code for the paper "Active Prompting with Chain-of-Thought for Large Language Models" ☆248 · Updated last year
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" ☆168 · Updated 4 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 · Updated 2 years ago
- Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought" ☆96 · Updated last year
- the instructions and demonstrations for building a formal logical reasoning capable GLM ☆55 · Updated last year
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆117 · Updated 2 years ago