jackaduma / Alpaca-LoRA-RLHF-PyTorchLinks
A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Alpaca architecture. Basically ChatGPT but with Alpaca
☆58Updated 2 years ago
Alternatives and similar repositories for Alpaca-LoRA-RLHF-PyTorch
Users that are interested in Alpaca-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
Sorting:
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following☆79Updated 10 months ago
- Unofficial implementation of AlpaGasus☆92Updated last year
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`.☆150Updated last year
- Code for ACL2023 paper: Pre-Training to Learn in Context☆107Updated 11 months ago
- ☆139Updated last year
- On Transferability of Prompt Tuning for Natural Language Processing☆99Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆85Updated last year
- ☆56Updated 2 years ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation☆219Updated last year
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.☆140Updated 2 months ago
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning☆244Updated last year
- ☆96Updated 2 years ago
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023)☆64Updated last year
- ⚡Research papers about leveraging the capabilities of language models⚡☆52Updated 2 years ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models"☆130Updated 2 years ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities.☆154Updated last year
- ☆172Updated 2 years ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆56Updated 2 years ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023☆35Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO.☆200Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆97Updated last year
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location.☆81Updated 11 months ago
- Self-Alignment with Principle-Following Reward Models☆162Updated 2 months ago
- ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels …☆272Updated last year
- The official repository of "ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models"☆44Updated 2 years ago
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models☆109Updated 3 years ago
- Reverse Instructions to generate instruction tuning data with corpus examples☆214Updated last year
- A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human…☆218Updated last year
- Simple next-token-prediction for RLHF☆227Updated last year
- ☆72Updated last year