☆52Feb 20, 2026Updated 2 weeks ago
Alternatives and similar repositories for trl-tuto
Users that are interested in trl-tuto are comparing it to the libraries listed below
- This repository contains all code examples for my TensorFlow World talk about "Advanced model deployments with TensorFlow Serving"☆17Dec 8, 2022Updated 3 years ago
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization"☆30Jan 10, 2026Updated last month
- A Survey Analyzing Generalization in Deep Reinforcement Learning☆37Oct 31, 2024Updated last year
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce…☆25Feb 16, 2026Updated 2 weeks ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…☆32Apr 20, 2024Updated last year
- A Transfer Learning Study of Gas Adsorption in Metal-Organic Frameworks☆14Jul 15, 2020Updated 5 years ago
- DiagnoSys is a comprehensive web application that provides advanced detection and analysis for various health conditions. This project le…☆14May 6, 2024Updated last year
- ☆16Feb 22, 2025Updated last year
- DreamSmooth: Improving Model-Based RL with Reward Smoothing (ICLR 2024)☆12May 6, 2024Updated last year
- Simple model for sentence compression (a.k.a. the baseline in Klerke et al., NAACL 2016)☆10Dec 16, 2018Updated 7 years ago
- The code for the paper "A Bayesian Approach to Online Planning" published in ICML 2024.☆13Jun 17, 2024Updated last year
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"☆29Feb 23, 2026Updated last week
- Code release for "Imagination Mechanism: Mesh Information Propagation for Enhancing Data Efficiency in Reinforcement Learning"☆13Oct 7, 2023Updated 2 years ago
- ☆14Mar 21, 2024Updated last year
- code for polite☆11Feb 28, 2024Updated 2 years ago
- ☆10Nov 16, 2023Updated 2 years ago
- LLM Skirmish☆44Feb 3, 2026Updated last month
- A collection of heat engines, based on the OpenAI Gym environment framework for use with reinforcement learning applications.☆15Dec 20, 2021Updated 4 years ago
- Teaching a humanoid to walk(ish), then displaying in your browser (using tensorflow.js and reinforcement learning)☆10Sep 7, 2020Updated 5 years ago
- A protein language model for learning the SARS-CoV-2 fitness landscape☆12Apr 22, 2025Updated 10 months ago
- Simple repository for training small reasoning models☆49Feb 17, 2026Updated 2 weeks ago
- ☆11Jan 11, 2022Updated 4 years ago
- Robot simulator using web technologies, just JavaScript☆10Feb 13, 2020Updated 6 years ago
- Aggregated project of artificial intelligence and machine learning methods☆11Oct 16, 2017Updated 8 years ago
- Neural Networks for penetration testing. Part of active research.☆13Jun 21, 2022Updated 3 years ago
- Neural Turing Machine☆13Jun 18, 2018Updated 7 years ago
- Hands-on repository for fine-tuning Large Language Models (LLMs) in the clinical domain with tutorials☆13Jan 9, 2026Updated last month
- My Very Own Deep Multiple Layered Echo State Network☆13Jan 2, 2021Updated 5 years ago
- Sample code and documentation for using the Microsoft HoloLens for Computer Vision research☆10Feb 14, 2022Updated 4 years ago
- 🐲 Stanford CS234 : Reinforcement Learning☆12Jan 14, 2019Updated 7 years ago
- ☆11Jan 17, 2025Updated last year
- Code for calculating grouped representation of interatomic distances (GRID) from crystal structures, and applying this in machine learnin…☆12Jun 22, 2023Updated 2 years ago
- A Very Simple Demo of Fine Tuning Sentence Transformers☆15Jun 15, 2023Updated 2 years ago
- Worldquant University's Capstone Project☆14Sep 5, 2023Updated 2 years ago
- Autonomous Agent for Kubernetes☆14Feb 14, 2025Updated last year
- EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets☆10Dec 12, 2023Updated 2 years ago
- Generating Protein Variants with Different Generative Models (HMM, VAE, ESM-2, ProtGPT2)☆11Mar 14, 2024Updated last year
- Advanced Data Science with IBM Specialization☆12Aug 9, 2021Updated 4 years ago
- https://github.com/mitsuba-renderer/mitsuba2 in Docker☆10Jun 13, 2020Updated 5 years ago