jackaduma / Alpaca-LoRA-RLHF-PyTorch
A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.
☆58Updated last year
Alternatives and similar repositories for Alpaca-LoRA-RLHF-PyTorch:
Users who are interested in Alpaca-LoRA-RLHF-PyTorch are comparing it to the repositories listed below.
- ☆96Updated last year
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following☆79Updated 5 months ago
- ⚡Research papers about leveraging the capabilities of language models⚡☆52Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆83Updated last year
- Unofficial implementation of AlpaGasus☆90Updated last year
- Code for ACL2023 paper: Pre-Training to Learn in Context☆108Updated 7 months ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.☆137Updated 8 months ago
- Source code and datasets for "How well do Large Language Models perform in Arithmetic tasks?"☆56Updated last year
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context"☆66Updated 6 months ago
- Code for paper 'Data-Efficient FineTuning'☆29Updated last year
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv 2401.01335)☆29Updated last year
- Lightweight tool to identify Data Contamination in LLMs evaluation☆46Updated 11 months ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models"☆129Updated last year
- ☆160Updated last year
- This repository contains the code for extracting the test samples we used in our paper: "A Multitask, Multilingual, Multimodal Evaluatio…☆77Updated last year
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆54Updated last year
- ☆67Updated last year
- Contrastive Chain-of-Thought Prompting☆58Updated last year
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages☆43Updated 3 months ago
- ☆35Updated last year
- ☆138Updated last year
- Instructions and demonstrations for building a GLM capable of formal logical reasoning☆53Updated 6 months ago
- ☆96Updated 5 months ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023☆32Updated last year
- Simple next-token-prediction for RLHF☆222Updated last year
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"☆72Updated 8 months ago
- On Transferability of Prompt Tuning for Natural Language Processing☆97Updated 10 months ago
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023)☆63Updated last year
- ☆172Updated last year
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"☆57Updated 2 years ago