Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) for any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
☆564 · May 9, 2024 · Updated last year
Alternatives and similar repositories for TextRL
Users who are interested in TextRL compare it to the libraries listed below.
- A modular RL library to fine-tune language models to human preferences ☆2,382 · Mar 1, 2024 · Updated 2 years ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,739 · Jan 8, 2024 · Updated 2 years ago
- A (somewhat) minimal library for finetuning language models with PPO on human feedback. ☆90 · Nov 23, 2022 · Updated 3 years ago
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆174 · Apr 7, 2023 · Updated 2 years ago
- ASR text preprocessing utility ☆21 · Aug 5, 2024 · Updated last year
- Train transformer language models with reinforcement learning. ☆17,697 · Updated this week
- A publishing website of a table collecting meta-learning-related papers in the area of human language processing. ☆17 · Aug 2, 2021 · Updated 4 years ago
- Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD ☆23 · Dec 13, 2022 · Updated 3 years ago
- ☆98 · May 30, 2023 · Updated 2 years ago
- 🤖📇 Handling multiple NLP tasks in one pipeline ☆57 · Sep 18, 2025 · Updated 6 months ago
- Used for adaptive human in the loop evaluation of language and embedding models. ☆307 · Mar 1, 2023 · Updated 3 years ago
- [NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation ☆475 · Mar 7, 2024 · Updated 2 years ago
- Code for the paper Fine-Tuning Language Models from Human Preferences ☆1,381 · Jul 25, 2023 · Updated 2 years ago
- ☆31 · Jul 13, 2023 · Updated 2 years ago
- ☆13 · Sep 25, 2024 · Updated last year
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5 ☆19 · Nov 29, 2022 · Updated 3 years ago
- 🏃 hosting nlp models in one line ☆20 · May 8, 2024 · Updated last year
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ☆1,827 · Jun 17, 2025 · Updated 9 months ago
- Official Github repo for the paper "Evaluating the Evaluation of Diversity in Natural Language Generation" ☆21 · Feb 23, 2021 · Updated 5 years ago
- one script for xls-r/xlsr/whisper fine-tuning ☆42 · Jun 29, 2023 · Updated 2 years ago
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ☆2,804 · Mar 1, 2026 · Updated 2 weeks ago
- Explore different ways to mix speech models (wav2vec2, hubert) and NLP models (BART, T5, GPT) together ☆46 · Jul 3, 2025 · Updated 8 months ago
- Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Textual Style Transfer ☆36 · Oct 2, 2022 · Updated 3 years ago
- Code for "Learning to summarize from human feedback" ☆1,062 · Sep 5, 2023 · Updated 2 years ago
- Diffusion-LM ☆1,229 · Aug 8, 2024 · Updated last year
- A curated list of reinforcement learning with human feedback resources (continually updated) ☆4,331 · Dec 9, 2025 · Updated 3 months ago
- ☆26 · Nov 21, 2022 · Updated 3 years ago
- ☆35 · Nov 17, 2021 · Updated 4 years ago
- [NIPS2023] RRHF & Wombat ☆808 · Sep 22, 2023 · Updated 2 years ago
- Code accompanying the paper Pretraining Language Models with Human Preferences ☆180 · Feb 13, 2024 · Updated 2 years ago
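- 🍳 NLPrep - dataset tool for many natural language processing tasks ☆28 · Jul 30, 2021 · Updated 4 years ago
- MediaTek Research is dedicated to research on foundation models. We put this research into models suited to Traditional Chinese users and, where licensing permits, make the models available for academic research or industry use. ☆266 · Sep 8, 2025 · Updated 6 months ago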
- Instruction Tuning with GPT-4 ☆4,338 · Jun 11, 2023 · Updated 2 years ago
- ☆34 · Mar 25, 2023 · Updated 2 years ago
- Convenient Text-to-Text Training for Transformers ☆19 · Dec 10, 2021 · Updated 4 years ago
- Revolutionize your development workflow with AI-powered code assistance, automating mock tests, suggestions, and unit test generation in … ☆34 · Feb 27, 2025 · Updated last year
- Aligning pretrained language models with instruction data generated by themselves. ☆4,587 · Mar 27, 2023 · Updated 2 years ago
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus ☆13 · Oct 15, 2022 · Updated 3 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless… ☆35 · Aug 10, 2023 · Updated 2 years ago