jackaduma / Alpaca-LoRA-RLHF-PyTorch
A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Alpaca architecture. Basically ChatGPT but with Alpaca
☆58Updated 2 years ago
Alternatives and similar repositories for Alpaca-LoRA-RLHF-PyTorch
Users that are interested in Alpaca-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
Sorting:
- ☆138Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆85Updated last year
- Unofficial implementation of AlpaGasus☆91Updated last year
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.☆139Updated last week
- Source codes and datasets for How well do Large Language Models perform in Arithmetic tasks?☆56Updated 2 years ago
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following☆79Updated 8 months ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models"☆129Updated 2 years ago
- On Transferability of Prompt Tuning for Natural Language Processing☆99Updated last year
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context"☆70Updated 9 months ago
- Code for ACL2023 paper: Pre-Training to Learn in Context☆108Updated 9 months ago
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023)☆63Updated last year
- ☆97Updated last year
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages☆46Updated 5 months ago
- This repository is the official implementation of our paper MVP: Multi-task Supervised Pre-training for Natural Language Generation.☆72Updated 2 years ago
- ☆69Updated last year
- ⚡Research papers about leveraging the capabilities of language models⚡☆52Updated 2 years ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023☆34Updated last year
- [ACL 2023] Code and Data Repo for Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)"☆53Updated last year
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆177Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO.☆200Updated last year
- A Multi-Turn Dialogue Corpus based on Alpaca Instructions☆171Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning☆240Updated last year
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆56Updated last year
- ☆35Updated last year
- Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"☆47Updated last year
- Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning.☆130Updated last year
- ☆98Updated 7 months ago
- Source code for the paper "Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data"☆20Updated last year
- A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human…☆213Updated 11 months ago
- The "GPT-API-Accelerate" project provides a set of Python classes for accelerating the process of generating responses to prompts using t…☆23Updated 7 months ago