Notes and commented code for RLHF (PPO)
☆126 · Feb 27, 2024 · Updated 2 years ago
Alternatives and similar repositories for rlhf-ppo
Users that are interested in rlhf-ppo are comparing it to the libraries listed below
- Notes on Direct Preference Optimization ☆24 · Apr 14, 2024 · Updated last year
- Distributed training (multi-node) of a Transformer model ☆94 · Apr 10, 2024 · Updated last year
- LLaMA 2 implemented from scratch in PyTorch ☆365 · Sep 25, 2023 · Updated 2 years ago
- ☆12 · Nov 15, 2022 · Updated 3 years ago
- Official Code For "Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models" ☆19 · Mar 25, 2025 · Updated 11 months ago
- Because it's there. ☆16 · Sep 22, 2024 · Updated last year
- Notes on the Mistral AI model ☆20 · Dec 27, 2023 · Updated 2 years ago
- PyTorch code for experiments on Linear Transformers ☆24 · Jan 12, 2024 · Updated 2 years ago
- ☆239 · Jan 2, 2025 · Updated last year
- A simple Python implementation of forward-forward NN training by G. Hinton from NeurIPS 2022 ☆21 · Dec 2, 2022 · Updated 3 years ago
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆113 · Feb 2, 2026 · Updated last month
- Building LLaMA 4 MoE from Scratch ☆72 · Apr 15, 2025 · Updated 10 months ago
- Attention is all you need implementation ☆1,179 · Jun 8, 2024 · Updated last year
- Convert English text from written expressions into spoken forms ☆28 · Jun 22, 2022 · Updated 3 years ago
- VERA-MH official repository ☆30 · Updated this week
- ☆32 · Mar 27, 2025 · Updated 11 months ago
- ☆46 · May 24, 2025 · Updated 9 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆84 · Nov 27, 2024 · Updated last year
- Column generation implementation based on Google OR-Tools for the cutting stock problem ☆14 · Aug 19, 2025 · Updated 6 months ago
- Public teaching materials for Reasoning and Agents ☆12 · May 29, 2025 · Updated 9 months ago
- CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics ☆28 · Nov 1, 2025 · Updated 4 months ago
- Tree-Invent: A novel molecular generative model constrained with topological tree ☆13 · Jul 26, 2023 · Updated 2 years ago
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration ☆10 · Dec 24, 2023 · Updated 2 years ago
- EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975 ☆38 · Dec 14, 2023 · Updated 2 years ago
- Replicating O1 inference-time scaling laws ☆93 · Dec 1, 2024 · Updated last year
- An IoT-based mobile application to monitor vitals such as ECG, Body Temperature, Blood Pressure using an ESP32 DevKit and React Nativ… ☆11 · Nov 14, 2024 · Updated last year
- ☆13 · Sep 14, 2021 · Updated 4 years ago
- A RAG that can scale 🧑🏻‍💻 ☆11 · May 28, 2024 · Updated last year
- Project focused on enhancing the quality of low-fidelity endoscopy images using Generative Adversarial Networks (GANs) implemented in PyT… ☆17 · Jun 5, 2025 · Updated 9 months ago
- A Model Agnostic function to directly remove specified layers from the LLM ☆10 · May 23, 2024 · Updated last year
- Deploying a custom PyTorch model to AWS SageMaker using Terraform and FastAPI ☆10 · Nov 10, 2023 · Updated 2 years ago
- ☆11 · Aug 28, 2025 · Updated 6 months ago
- [ACL 2023] Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generati… ☆10 · Sep 23, 2023 · Updated 2 years ago
- JMLR Cover Letter Template ☆10 · Dec 15, 2021 · Updated 4 years ago
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆25 · Jul 21, 2025 · Updated 7 months ago
- NeurIPS 2024 tutorial on LLM Inference ☆49 · Dec 10, 2024 · Updated last year
- Instantly fix problems with ChatGPT AI. Use ChatGPT and GPT-4 AI tools to find one-click 'lightbulb menu' solutions to problems in your c… ☆12 · Mar 26, 2023 · Updated 2 years ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024 ☆11 · Jun 6, 2024 · Updated last year
- See https://github.com/cuda-mode/triton-index/ instead! ☆11 · May 8, 2024 · Updated last year