jackaduma / Alpaca-LoRA-RLHF-PyTorch
A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Alpaca architecture. Basically ChatGPT but with Alpaca
☆60 · Updated 2 years ago
Alternatives and similar repositories for Alpaca-LoRA-RLHF-PyTorch
Users that are interested in Alpaca-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
- Code for ACL2023 paper: Pre-Training to Learn in Context ☆107 · Updated last year
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆153 · Updated last year
- Unofficial implementation of AlpaGasus ☆92 · Updated last year
- On Transferability of Prompt Tuning for Natural Language Processing ☆100 · Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆87 · Updated last year
- Code for "Small Models are Valuable Plug-ins for Large Language Models" ☆131 · Updated 2 years ago
- ☆140 · Updated 2 years ago
- ☆173 · Updated 2 years ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆139 · Updated 4 months ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆219 · Updated 2 years ago
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023) ☆64 · Updated last year
- ⚡Research papers about leveraging the capabilities of language models⚡ ☆52 · Updated 2 years ago
- Source code for the paper "Active Prompting with Chain-of-Thought for Large Language Models" ☆245 · Updated last year
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆78 · Updated last year
- [NAACL 2024] Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models ☆86 · Updated last year
- A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human… ☆219 · Updated last year
- ⏳ ChatLog: Recording and Analysing ChatGPT Across Time ☆102 · Updated last year
- This repository is the official implementation of our paper MVP: Multi-task Supervised Pre-training for Natural Language Generation. ☆73 · Updated 2 years ago
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆247 · Updated last year
- ☆74 · Updated last year
- ☆35 · Updated 2 years ago
- Reverse Instructions to generate instruction tuning data with corpus examples ☆215 · Updated last year
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆190 · Updated last year
- Self-Alignment with Principle-Following Reward Models ☆165 · Updated 4 months ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆177 · Updated 2 years ago
- ☆180 · Updated 2 years ago
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆203 · Updated 2 years ago
- llama fine-tuning with lora ☆138 · Updated last year
- Counting-Stars (★) ☆83 · Updated 3 months ago
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models ☆109 · Updated 3 years ago