SalesforceAIResearch/UserBench

☆63

Alternatives and similar repositories for UserBench

Users who are interested in UserBench are comparing it to the repositories listed below.

SalesforceAIResearch / UserRL
The raw UserRL repo under construction
☆110 · Jun 2, 2026 · Updated last month
Jielin-Qiu / MMWatermark-Robustness
Evaluating Durability: Benchmark Insights into Multimodal Watermarking
☆12 · Jun 7, 2024 · Updated 2 years ago
HenryLHH / fusion
This is the source code of FUSION, a safety-aware causal representation for generalizable driving agents.
☆28 · Oct 23, 2024 · Updated last year
IBM / API-BLEND
Companion code to https://arxiv.org/abs/2402.15491
☆22 · Sep 18, 2025 · Updated 10 months ago
SalesforceAIResearch / LoCoBench
☆46 · Jun 2, 2026 · Updated last month
meituan / vitabench
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
☆23 · Oct 17, 2025 · Updated 9 months ago
Jiacheng-Zhu-AIML / AsymmetryLoRA
Preprint: Asymmetry in Low-Rank Adapters of Foundation Models
☆40 · Feb 27, 2024 · Updated 2 years ago
Evanwu1125 / LiteCoT
☆17 · Jun 10, 2025 · Updated last year
qiancheng0 / ModelingAgent
☆23 · Sep 7, 2025 · Updated 10 months ago
Jiacheng-Zhu-AIML / FOT
Functional Optimal Transport: Map Estimation and Domain Adaptation for Functional data
☆28 · Jun 7, 2021 · Updated 5 years ago
zzwkk / MUA-RL
MUA-RL: MULTI-TURN USER-INTERACTING AGENT REINFORCEMENT LEARNING FOR AGENTIC TOOL USE
☆65 · Nov 5, 2025 · Updated 8 months ago
yihangyao / OASIS
☆20 · Nov 3, 2024 · Updated last year
eigent-ai / toolathlon_gym
Toolathlon-Gym for testing AI agents real-world tool-use capabilities across diverse MCP servers.
☆138 · Apr 2, 2026 · Updated 3 months ago
HanjiangHu / camera-motion-smoothing
This is the official code for CoRL 2022 "Robustness Certification of Visual Perception Models via Camera Motion Smoothing"
☆11 · Apr 5, 2023 · Updated 3 years ago
VisualSphinx / VisualSphinx
☆17 · Jun 3, 2025 · Updated last year
Jiacheng-Zhu-AIML / WGPOT
The Wasserstein Distance and Optimal Transport Map of Gaussian Processes
☆51 · Aug 3, 2020 · Updated 5 years ago
Leezekun / MacRAG
☆24 · Jul 2, 2025 · Updated last year
liuzuxin / Bullet-Safety-Gym
An open-source framework to benchmark and assess safety specifications of Reinforcement Learning problems.
☆14 · Aug 25, 2023 · Updated 2 years ago
hnyu / seditor
Code release for the paper "Towards Safe Reinforcement Learning with a Safety Editor Policy", Yu et al., arXiv 2022
☆17 · Apr 3, 2025 · Updated last year
multimodal-art-projection / KORGym
☆60 · May 21, 2025 · Updated last year
FranxYao / Complexity-Based-Prompting
Complexity Based Prompting for Multi-Step Reasoning
☆17 · Mar 10, 2023 · Updated 3 years ago
cxcscmu / deepresearch_benchmarking
☆29 · Mar 10, 2026 · Updated 4 months ago
Continual-Lifelong-Learning / resources
☆17 · Feb 21, 2020 · Updated 6 years ago
OS-Copilot / ScienceBoard
[ICLR 2026] Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"
☆131 · Feb 2, 2026 · Updated 5 months ago
sunnweiwei / PPP-Agent
Training Proactive and Personalized LLM Agents
☆112 · Jan 20, 2026 · Updated 6 months ago
SalesforceAIResearch / LoCoBench-Agent
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering
☆22 · Jun 2, 2026 · Updated last month
abdulhaim / consistent-LLMs
☆19 · Nov 5, 2025 · Updated 8 months ago
Linn3a / siren
Official implementation of Selective Entropy Regularization (SIREN), proposed by paper 'Rethinking Entropy Regularization in Large Reason…
☆32 · Dec 10, 2025 · Updated 7 months ago
GregxmHu / OccuBench
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models
☆21 · Apr 14, 2026 · Updated 3 months ago
GilgameshD / GRADER
This is the official implementation of NeurIPS 2022 paper "Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal R…
☆35 · Jan 25, 2023 · Updated 3 years ago
UMass-Embodied-AGI / BudgetGuidance
[ACL'26 Findings] Steering LLM Thinking with Budget Guidance
☆32 · Feb 19, 2026 · Updated 5 months ago
shadowkiller33 / ParaScore
☆31 · Apr 14, 2023 · Updated 3 years ago
Jielin-Qiu / MM_Robustness
[DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift
☆39 · Jan 25, 2024 · Updated 2 years ago
zorazrw / agent-skill-induction
Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks"
☆42 · Apr 24, 2025 · Updated last year
IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆22 · Updated this week
allenai / beacon
On-the-fly Definition Augmentation of LLMs for Biomedical NER
☆14 · Apr 14, 2025 · Updated last year
salesforce / SRMA
Contrastive Learning with Model Augmentation
☆18 · Jun 2, 2026 · Updated last month
scrambledpie / GPVAE
Train and visualise a latent variable model of moving objects.
☆16 · Apr 28, 2020 · Updated 6 years ago
abhishekpanigrahi1996 / Skill-Localization-by-grafting
☆52 · Jan 1, 2024 · Updated 2 years ago