google-deepmind / nonstationary_mbml
Memory-Based Meta-Learning on Non-Stationary Distributions
☆17 Updated last year
Alternatives and similar repositories for nonstationary_mbml
Users who are interested in nonstationary_mbml are comparing it to the repositories listed below
- Supplementary Data for Evolving Reinforcement Learning Algorithms ☆47 Updated 4 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆92 Updated last year
- ☆31 Updated 3 years ago
- Clean RL implementation using MLX ☆34 Updated last year
- Generative cellular automaton-like learning environments for RL. ☆20 Updated 11 months ago
- Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for Transformer Agents" ☆31 Updated 2 years ago
- Repo to reproduce the First-Explore paper results ☆38 Updated last year
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆61 Updated 3 years ago
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes" ☆19 Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Updated 2 years ago
- MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection ☆46 Updated 2 years ago
- ☆16 Updated last year
- ☆19 Updated 2 years ago
- Learn online intrinsic rewards from LLM feedback ☆45 Updated last year
- Drop-in environment replacements that make your RL algorithm train faster. ☆21 Updated last year
- ☆35 Updated last year
- Intrinsic Motivation from Artificial Intelligence Feedback ☆134 Updated 2 years ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆39 Updated 3 years ago
- A Gymnasium-based Environment of the Abstraction and Reasoning Corpus (ARC) ☆69 Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆41 Updated last year
- ☆56 Updated last year
- ☆26 Updated 3 years ago
- Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories ☆42 Updated 2 years ago
- ☆23 Updated 4 years ago
- Triton Implementation of HyperAttention Algorithm ☆48 Updated 2 years ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 Updated last year
- GPT implementation in Flax ☆18 Updated 4 years ago
- Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers" (NeurIPS 2021) ☆51 Updated 7 months ago
- Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making" ☆29 Updated last year
- ☆27 Updated last year