jasonvanf / llama-trl
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
☆237 · Updated 5 months ago
Alternatives and similar repositories for llama-trl
Users who are interested in llama-trl are comparing it to the libraries listed below.
- ☆281 · Updated last year
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆137 · Updated 8 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆269 · Updated last year
- A large-scale, fine-grained, diverse preference dataset (and models). ☆361 · Updated 2 years ago
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024] ☆583 · Updated last year
- Generative Judge for Evaluating Alignment ☆249 · Updated 2 years ago
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat ☆117 · Updated 2 years ago
- Data and Code for Program of Thoughts [TMLR 2023] ☆303 · Updated last year
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning ☆284 · Updated 2 years ago
- Datasets for Instruction Tuning of Large Language Models ☆260 · Updated 2 years ago
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo… ☆414 · Updated 7 months ago
- [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning ☆511 · Updated last year
- PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets ☆350 · Updated 2 years ago
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models" ☆533 · Updated last year
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆410 · Updated last year
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024] ☆382 · Updated last year
- ☆340 · Updated 7 months ago
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 2023 (oral) ☆268 · Updated last year
- ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels … ☆285 · Updated 2 years ago
- Scripts of LLM pre-training and fine-tuning (w/wo LoRA, DeepSpeed) ☆87 · Updated last year
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey` ☆226 · Updated 5 months ago
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆207 · Updated 2 years ago
- Prod Env ☆437 · Updated 2 years ago
- A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval. ☆387 · Updated 2 years ago
- Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them ☆544 · Updated last year
- A paper & resource list of large language models, including course, paper, demo, figures ☆200 · Updated 2 years ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆185 · Updated 7 months ago
- Direct Preference Optimization from scratch in PyTorch ☆124 · Updated 9 months ago
- This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback. ☆563 · Updated last year
- ☆334 · Updated last year