☆23Nov 11, 2024Updated last year
Alternatives and similar repositories for orpo
Users that are interested in orpo are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Official Implementation of Avoiding spurious correlations via logit correction☆17May 6, 2023Updated 3 years ago
- ☆19Aug 4, 2025Updated 9 months ago
- Code for the CCE algorithm proposed in "Towards Compositionality in Concept Learning" at ICML 2024.☆16Jun 2, 2024Updated last year
- Data and Code for StructuredRegex.☆14Nov 16, 2023Updated 2 years ago
- Batch Multi-Fidelity Bayesian Optimization with Deep Auto-Regressive Networks☆12Nov 3, 2021Updated 4 years ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- [NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training☆32Jan 9, 2022Updated 4 years ago
- Code for a model-based version of Constrained Policy Optimization☆11May 6, 2021Updated 5 years ago
- Model-based reinforcement learning using CEM, MPC and PETS☆16Nov 20, 2019Updated 6 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence☆10Mar 2, 2025Updated last year
- Active Learning Helps Pretrained Models Learn the Intended Task (https://arxiv.org/abs/2204.08491) by Alex Tamkin, Dat Nguyen, Salil Desh…☆11Nov 22, 2022Updated 3 years ago
- Ready to run PyTorch implementation of Data2Vec 2.0: Highly efficient self-supervised representation learning for vision, speech and text…☆16Mar 29, 2023Updated 3 years ago
- ☆23Jun 15, 2022Updated 3 years ago
- Code for paper "Prompt Engineering a Prompt Engineer" (https://arxiv.org/abs/2311.05661)☆10Aug 1, 2024Updated last year
- NeurIPS 2025: Discriminative Constrained Optimization for Reinforcing Large Reasoning Models☆53Mar 14, 2026Updated 2 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- The data and the PyTorch implementation for the models and experiments in the paper "Language Model Decoding as Likelihood–Utility Alignm…☆14Sep 7, 2023Updated 2 years ago
- [NeurIPS 2021] “When does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning?”☆49Nov 21, 2021Updated 4 years ago
- [ICLR'21] Dataset Inference for Ownership Resolution in Machine Learning☆31Oct 10, 2022Updated 3 years ago
- This is a repository for code, data, and models associated with the paper LLM-RUBRIC: A Multidimensional, Calibrated Approach to Automate…☆32Mar 30, 2026Updated last month
- Multi-step reasoning MLLM☆23Mar 8, 2026Updated 2 months ago
- NeurIPS 2024 tutorial on LLM Inference☆49Dec 10, 2024Updated last year
- Official code implementation for the paper "Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Expl…☆12Apr 4, 2025Updated last year
- ☆29Jan 16, 2023Updated 3 years ago
- [ICML 2026 Spotlight] Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback☆66May 4, 2026Updated 3 weeks ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- "Semantic Evaluation for Text-to-SQL with Distilled Test Suite", EMNLP2020☆42Dec 1, 2020Updated 5 years ago
- We systematically studied the influencing factors when LLM generates benchmarks,By using our code, you can generate high-quality QA datas…☆20May 20, 2025Updated last year
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [NeurIPS 2020]☆35Jul 3, 2021Updated 4 years ago
- PyTorch implementation of Count-Based Exploration with Neural Density Models☆10Mar 22, 2018Updated 8 years ago
- Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off☆32Apr 28, 2022Updated 4 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…☆53Oct 22, 2023Updated 2 years ago
- Count based exploration with the successor representation for Unity ML's Pyramid☆12Jun 19, 2019Updated 6 years ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023.☆56Mar 6, 2023Updated 3 years ago
- Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon"☆50Jun 26, 2024Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- ☆11Mar 6, 2022Updated 4 years ago
- PyTorch implementations of Reinforcement Learning algorithms in less than 200 lines☆10Apr 3, 2020Updated 6 years ago
- Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use☆30Nov 4, 2025Updated 6 months ago
- An official implementation of ProbeGen☆13Oct 20, 2024Updated last year
- ☆15Dec 14, 2024Updated last year
- Code for generating a single image pretraining dataset☆13Aug 3, 2021Updated 4 years ago
- code associated with WANLI dataset in Liu et al., 2022☆30May 24, 2023Updated 3 years ago