Bilkent-CYBORG / VOPy
A Framework for Black-box Vector Optimization
☆32Updated last month
Alternatives and similar repositories for VOPy
Users who are interested in VOPy are comparing it to the libraries listed below.
- [ICLR 2025] Official implementation of DICL (Disentangled In-Context Learning), featured in the paper "Zero-shot Model-based Reinforcemen…☆26Updated 7 months ago
- Causal Agent based on Large Language Model☆51Updated 2 weeks ago
- ☆54Updated 3 months ago
- This repository contains code for the paper "Learning Decision Trees as Amortized Structure Inference"☆15Updated 6 months ago
- Gradient Boosting Reinforcement Learning (GBRL)☆118Updated last month
- GBRL-based Actor-Critic algorithms implemented in stable-baselines3☆38Updated last month
- Repo to reproduce the First-Explore paper results☆38Updated 9 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.☆111Updated last month
- Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories☆42Updated 2 years ago
- ☆64Updated 5 months ago
- Drop-in environment replacements that make your RL algorithm train faster.☆21Updated last year
- Revisiting Hierarchical Text Classification: Inference and Metrics☆13Updated 9 months ago
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok …☆27Updated 2 months ago
- The original Shared Recurrent Memory Transformer implementation☆31Updated 2 months ago
- Official source code for "Graph Neural Networks for Learning Equivariant Representations of Neural Networks". In ICLR 2024 (oral).☆82Updated last year
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆25Updated 2 weeks ago
- Bayes-Adaptive RL for LLM Reasoning☆39Updated 3 months ago
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)☆37Updated last week
- ☆19Updated 6 months ago
- Dataset Reset Policy Optimization☆30Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆41Updated last year
- [ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"☆17Updated last year
- ☆35Updated 9 months ago
- Efficient Scaling laws and collaborative pretraining.☆18Updated last week
- Retrieval-Augmented Decision Transformer: External Memory for In-context RL☆22Updated 10 months ago
- ☆22Updated 11 months ago
- ☆229Updated last month
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers☆20Updated 6 months ago
- Extending Conformal Prediction to LLMs☆67Updated last year
- Implementation of Dualformer☆20Updated 6 months ago