☆22 · Nov 11, 2024 · Updated last year
Alternatives and similar repositories for orpo
Users that are interested in orpo are comparing it to the libraries listed below.
- Official Implementation of Avoiding spurious correlations via logit correction ☆17 · May 6, 2023 · Updated 2 years ago
- ☆19 · Aug 4, 2025 · Updated 8 months ago
- Data and Code for StructuredRegex. ☆14 · Nov 16, 2023 · Updated 2 years ago
- Tasks for describing differences between text distributions. ☆17 · Aug 9, 2024 · Updated last year
- Batch Multi-Fidelity Bayesian Optimization with Deep Auto-Regressive Networks ☆12 · Nov 3, 2021 · Updated 4 years ago
- [NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training ☆32 · Jan 9, 2022 · Updated 4 years ago
- Code repository for On the interaction between supervision and self-play in emergent communication (ICLR 2020) ☆15 · Feb 4, 2020 · Updated 6 years ago
- Code for a model-based version of Constrained Policy Optimization ☆11 · May 6, 2021 · Updated 4 years ago
- Synthetic low- and medium-voltage power distribution grids in Switzerland. ☆16 · Apr 7, 2025 · Updated last year
- Model-based reinforcement learning using CEM, MPC and PETS ☆16 · Nov 20, 2019 · Updated 6 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence ☆10 · Mar 2, 2025 · Updated last year
- Active Learning Helps Pretrained Models Learn the Intended Task (https://arxiv.org/abs/2204.08491) by Alex Tamkin, Dat Nguyen, Salil Desh… ☆11 · Nov 22, 2022 · Updated 3 years ago
- Run MPC-based policies and train RL agents in gym-anm environments using implementations from Stable Baselines 3. ☆12 · Mar 2, 2023 · Updated 3 years ago
- Subway train energy-efficiency algorithm based on reinforcement learning ☆12 · Jul 4, 2020 · Updated 5 years ago
- NeurIPS 2025: Discriminative Constrained Optimization for Reinforcing Large Reasoning Models ☆53 · Mar 14, 2026 · Updated last month
- Microgrid/distribution network level energy market managed by an RL agent ☆12 · Feb 19, 2021 · Updated 5 years ago
- Competitive Reinforcement Learning for Real-Time Pricing and Scheduling Control in Coupled EV Charging Stations and Power Networks ☆13 · Jun 10, 2023 · Updated 2 years ago
- The data and the PyTorch implementation for the models and experiments in the paper "Language Model Decoding as Likelihood–Utility Alignm… ☆14 · Sep 7, 2023 · Updated 2 years ago
- [ICLR'21] Dataset Inference for Ownership Resolution in Machine Learning ☆31 · Oct 10, 2022 · Updated 3 years ago
- This repository contains the code to train the baseline agent provided in the 2022 edition of Learning to Run a Power Network and to recr… ☆15 · Aug 2, 2022 · Updated 3 years ago
- This is a repository for code, data, and models associated with the paper LLM-RUBRIC: A Multidimensional, Calibrated Approach to Automate… ☆29 · Mar 30, 2026 · Updated 2 weeks ago
- Optimal power flow by using PSO ☆12 · Aug 12, 2019 · Updated 6 years ago
- ☆15 · Apr 8, 2023 · Updated 3 years ago
- ☆64 · Mar 8, 2026 · Updated last month
- Official code implementation for the paper "Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Expl… ☆12 · Apr 4, 2025 · Updated last year
- NeurIPS 2024 tutorial on LLM Inference ☆49 · Dec 10, 2024 · Updated last year
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training ☆13 · Feb 15, 2025 · Updated last year
- ☆29 · Jan 16, 2023 · Updated 3 years ago
- Ultra fast power flow for scenario analysis. ☆19 · Apr 19, 2024 · Updated 2 years ago
- ☆52 · Oct 10, 2024 · Updated last year
- ☆89 · Apr 11, 2026 · Updated last week
- We systematically studied the influencing factors when an LLM generates benchmarks. By using our code, you can generate high-quality QA datas… ☆20 · May 20, 2025 · Updated 11 months ago
- Official implementation for "PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning" (NeurIPS 2024) ☆19 · Oct 13, 2024 · Updated last year
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [NeurIPS 2020] ☆35 · Jul 3, 2021 · Updated 4 years ago
- PyTorch implementation of Count-Based Exploration with Neural Density Models ☆10 · Mar 22, 2018 · Updated 8 years ago
- Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off ☆32 · Apr 28, 2022 · Updated 3 years ago
- [ICLR 2022] "Sparsity Winning Twice: Better Robust Generalization from More Efficient Training" by Tianlong Chen*, Zhenyu Zhang*, Pengjun… ☆40 · Mar 20, 2022 · Updated 4 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 · Oct 22, 2023 · Updated 2 years ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆56 · Mar 6, 2023 · Updated 3 years ago