Code for the paper "Toward Optimal LLM Alignments Using Two-Player Games".
☆17 · Jun 20, 2024 · Updated last year
Alternatives and similar repositories for gpo
Users interested in gpo are comparing it to the libraries listed below.
- Implicit Differentiable Optimal Control (IDOC) with JAX ☆12 · May 11, 2022 · Updated 3 years ago
- ☆13 · Feb 11, 2021 · Updated 5 years ago
- ☆18 · Feb 7, 2021 · Updated 5 years ago
- Official PyTorch Implementation for Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning ☆19 · Jan 11, 2023 · Updated 3 years ago
- ☆24 · Oct 21, 2024 · Updated last year
- OpenAI Gym environment for DART robotics simulator. ☆22 · Apr 17, 2018 · Updated 7 years ago
- Plannable Approximations to MDP Homomorphisms: Equivariance under Actions ☆30 · Jun 30, 2020 · Updated 5 years ago
- Tools for visualizing and comparing data from vertebrate retinas ☆14 · Jan 20, 2025 · Updated last year
- JAX implementation of the JKOnet* architecture presented in "Learning Diffusion at Lightspeed". ☆37 · Mar 18, 2025 · Updated 11 months ago
- Code for minimum-entropy coupling. ☆32 · Jan 6, 2026 · Updated 2 months ago
- TD-Regularized Actor-Critic Methods ☆36 · Dec 26, 2019 · Updated 6 years ago
- Repo for the paper "Landscape Surrogate Learning Decision Losses for Mathematical Optimization Under Partial Information" ☆38 · Jul 20, 2023 · Updated 2 years ago
- Code release for "Stochastic Optimal Control Matching" ☆39 · Aug 14, 2024 · Updated last year
- JAX implementation of VQVAE/VQGAN autoencoders (+FSQ) ☆42 · Jun 6, 2024 · Updated last year
- Understanding Short-Horizon Bias in Stochastic Meta-Optimization ☆37 · Mar 8, 2018 · Updated 8 years ago
- Tree-structured recurrent switching linear dynamical systems ☆38 · Jul 13, 2020 · Updated 5 years ago
- My CV ☆39 · Updated this week
- ☆16 · Jun 25, 2025 · Updated 8 months ago
- Active Learning with Partial Feedback, ICLR 2019 ☆11 · Apr 27, 2020 · Updated 5 years ago
- ☆11 · Jun 15, 2019 · Updated 6 years ago
- ⚠️ ARCHIVED - All development moved to https://github.com/itbench-hub/ITBench/tree/main/scenarios ☆15 · Feb 24, 2026 · Updated last week
- ☆10 · Feb 17, 2019 · Updated 7 years ago
- Microsoft question-answering dataset ☆10 · Jun 16, 2023 · Updated 2 years ago
- Implementation for ICML 2019 paper, EMI: Exploration with Mutual Information. ☆37 · Dec 7, 2020 · Updated 5 years ago
- How to train a neural ODE for time series/weather forecasting ☆39 · Feb 6, 2023 · Updated 3 years ago
- Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions (ICML 2020) ☆43 · Dec 8, 2022 · Updated 3 years ago
- JAX implementation of GPTQ quantization algorithm ☆10 · Jul 19, 2023 · Updated 2 years ago
- ☆10 · Jun 4, 2024 · Updated last year
- The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1 ☆13 · Apr 23, 2025 · Updated 10 months ago
- The package is developed for treatment recommendation & pairwise treatment individual effect estimation (ITE/CATE/HTE) when multiple trea… ☆11 · Mar 9, 2023 · Updated 3 years ago
- A tiny reinforcement learning codebase for continuous control, built on top of JAX. ☆15 · Mar 28, 2023 · Updated 2 years ago
- Gym wrapper for Vizdoom environments ☆12 · Dec 14, 2018 · Updated 7 years ago
- Gym implementation of connector to Deepmind lab ☆12 · Mar 26, 2019 · Updated 6 years ago
- Code for "Using Embeddings to Correct for Unobserved Confounding" ☆10 · May 31, 2019 · Updated 6 years ago
- Jax implementation of VIT-VQGAN ☆10 · Jan 25, 2024 · Updated 2 years ago
- Multi-Objective Causal Bayesian Optimisation, a new paradigm for finding Pareto-optimal interventions in multi-outcome causal models ☆16 · Jun 2, 2025 · Updated 9 months ago
- [IJCAI'23] Speeding Up Multi-Objective Hyperparameter Optimization by Task Similarity-Based Meta-Learning for the Tree-Structured Parzen … ☆10 · Mar 9, 2024 · Updated 2 years ago
- JAX/Haiku implementation of "Auction Learning as a Two-Player Game" ☆11 · Jul 6, 2024 · Updated last year
- [ICSE 2023] Differentiable interpretation and failure-inducing input generation for neural network numerical bugs. ☆13 · Jan 5, 2024 · Updated 2 years ago