jackaduma / Alpaca-LoRA-RLHF-PyTorch
A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.
☆58Updated 2 years ago
Alternatives and similar repositories for Alpaca-LoRA-RLHF-PyTorch
Users that are interested in Alpaca-LoRA-RLHF-PyTorch are comparing it to the libraries listed below
- ☆96Updated 2 years ago
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023)☆64Updated last year
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.☆140Updated last month
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following☆79Updated 9 months ago
- Source code and datasets for "How well do Large Language Models perform in Arithmetic tasks?"☆56Updated 2 years ago
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks☆57Updated last year
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆56Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆85Updated last year
- ☆172Updated 2 years ago
- ☆66Updated 3 years ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023☆35Updated last year
- Unofficial implementation of AlpaGasus☆91Updated last year
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI☆94Updated 2 years ago
- Minimal implementation of the "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" paper (arXiv:2401.01335)☆28Updated last year
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation☆219Updated last year
- The corresponding code from our paper "REFINER: Reasoning Feedback on Intermediate Representations" (EACL 2024). Do not hesitate t…☆70Updated last year
- ☆35Updated last year
- Instructions and demonstrations for building a GLM capable of formal logical reasoning☆53Updated 9 months ago
- Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning.☆131Updated 2 years ago
- Token-level Reference-free Hallucination Detection☆94Updated last year
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"☆56Updated 3 years ago
- ⚡Research papers about leveraging the capabilities of language models⚡☆52Updated 2 years ago
- The official repository of "ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models"☆43Updated 2 years ago
- Reinforcement learning training for GPT-2, LLaMA, BLOOM, and other LLMs☆25Updated last year
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.☆162Updated last year
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".☆160Updated last year
- ☆68Updated 2 years ago
- Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"☆47Updated last year
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…☆114Updated 11 months ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"☆59Updated 5 months ago