jylee425 / mobilesafetybench
Evaluating Safety of Autonomous Agents in Mobile Device Control
☆19Updated last month
Alternatives and similar repositories for mobilesafetybench:
Users who are interested in mobilesafetybench are comparing it to the repositories listed below
- Rewarded soups official implementation☆54Updated last year
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment"☆57Updated 2 months ago
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity"☆40Updated last year
- ☆36Updated last year
- Implementation of ICML 2023 paper: Future-conditioned Unsupervised Pretraining for Decision Transformer☆27Updated last year
- Guide Your Agent with Adaptive Multimodal Rewards (NeurIPS 2023 Accepted)☆33Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆70Updated 6 months ago
- Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs"☆24Updated last week
- Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)☆41Updated 7 months ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models☆35Updated last year
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation)☆30Updated 2 months ago
- ☆124Updated 7 months ago
- Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-informat…☆16Updated last year
- Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt…☆22Updated 6 months ago
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!☆34Updated 7 months ago
- Paper collection tracking the continuing line of work that started from World Models.☆167Updated 7 months ago
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024)☆61Updated 7 months ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep☆74Updated 7 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆136Updated 11 months ago
- Official code for "Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning".☆40Updated 10 months ago
- ☆31Updated 11 months ago
- ☆17Updated last year
- HAZARD challenge☆28Updated 9 months ago
- Direct preference optimization with f-divergences.☆13Updated 4 months ago
- Code for the paper "Learning Options via Compression" at NeurIPS 2022☆23Updated 2 years ago
- ☆79Updated 8 months ago
- Code for the paper "Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation"☆18Updated last year
- Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder (NeurIPS 2023)☆10Updated 8 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"☆126Updated 3 months ago