Online Preference Alignment for Language Models via Count-based Exploration
☆18Jan 14, 2025Updated last year
Alternatives and similar repositories for COPO
Users who are interested in COPO are comparing it to the libraries listed below.
- Accompanying Code for "Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning", ICML 2023☆23Dec 29, 2023Updated 2 years ago
- Official Implementation of "Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach"☆36Apr 6, 2026Updated last month
- A benchmark for evaluating reinforcement learning algorithms that train the policies using imaginary rollouts from LLMs.☆14Nov 4, 2025Updated 6 months ago
- [ICML' 24] The PyTorch implementation of our paper: "Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforc…☆24May 29, 2024Updated last year
- Code space for L4DC paper "State-wise Safe Reinforcement Learning With Pixel Observations"☆11Apr 5, 2024Updated 2 years ago
- LLM-Empowered State Representation for Reinforcement Learning (ICML2024 Accepted paper)☆38Jun 14, 2024Updated last year
- [IROS2024] STAIR: Semantic-Targeted Active Implicit Reconstruction☆17Aug 3, 2024Updated last year
- Feasibility Consistent Representation Learning for Safe Reinforcement Learning (ICML 2024). Current SOTA model-free safe RL algorithm on …☆16Jul 12, 2024Updated last year
- Blog post: how to do deterministic policy gradient with gumbel softmax and why you should do it.☆12Jun 20, 2017Updated 8 years ago
- ☆13May 13, 2025Updated 11 months ago
- Official implementation of paper: LiNo: Advancing Recursive Residual Decomposition of Linear and Nonlinear Patterns for Robust Time Serie…☆18Dec 19, 2025Updated 4 months ago
- Robust and safe deep reinforcement learning algorithms☆17Mar 27, 2024Updated 2 years ago
- ☆16Jun 12, 2024Updated last year
- G-HER algorithm☆18May 24, 2019Updated 6 years ago
- ☆20Nov 3, 2024Updated last year
- ☆40May 19, 2025Updated 11 months ago
- ☆13Jun 4, 2025Updated 11 months ago
- The repository is for Reinforcement-Learning Uncertainty research, in which we investigate various uncertain factors in RL.☆23Jun 16, 2023Updated 2 years ago
- Pessimistic Value Iteration for Multi-Task Data Sharing in Offline RL☆18Nov 21, 2023Updated 2 years ago
- [ICLR 2026] The official implementation of the paper "Exploring the Potential of Encoder-free Architectures in 3D LMMs"☆10Jan 26, 2026Updated 3 months ago
- This is the source code of FUSION, a safety-aware causal representation for generalizable driving agents.☆26Oct 23, 2024Updated last year
- ☆12Nov 10, 2020Updated 5 years ago
- The official implementation of "Transformer in Transformer as Backbone for Deep Reinforcement Learning"☆59Dec 27, 2023Updated 2 years ago
- ICML 2024 - Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning☆10Jul 16, 2024Updated last year
- ☆16Nov 1, 2023Updated 2 years ago
- [CVPR 2025] The official implementation of "CacheQuant: Comprehensively Accelerated Diffusion Models"☆48Nov 2, 2025Updated 6 months ago
- Code for paper Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety.☆20May 22, 2022Updated 3 years ago
- Official Implementation of "Align-Then-stEer: Adapting the Vision-Language Action Models through Unified Latent Guidance".☆67Oct 16, 2025Updated 6 months ago
- A PyTorch implementation of [VCT](https://github.com/google-research/google-research/tree/master/vct)☆10Nov 25, 2022Updated 3 years ago
- ☆14Dec 11, 2023Updated 2 years ago
- TVCG 2022: Task-Aware Sampling Layer for Point-Wise Analysis☆16Jan 21, 2024Updated 2 years ago
- Metaskill: A Meta-Skill for Autonomous AI Agent Team Generation☆37Feb 23, 2026Updated 2 months ago
- ☆33Jul 15, 2025Updated 9 months ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- Easy to install Text to Speech system for Raspberry Pi 4☆17Mar 4, 2024Updated 2 years ago
- ☆106Jul 18, 2025Updated 9 months ago
- Benchmarking Deepseek R1 API response speeds across different providers for performance comparison.☆10Feb 15, 2025Updated last year
- Please visit our demonstration website for interactive demonstrations☆33Oct 1, 2024Updated last year
- An LLM prompt to give some semblance of referential recursive structure☆24Apr 29, 2026Updated last week