SafeArena is a benchmark for assessing the harmful capabilities of web agents
☆21Apr 23, 2025Updated 11 months ago
Alternatives and similar repositories for safearena
Users that are interested in safearena are comparing it to the libraries listed below.
- Synthetic Data Generation for Evaluation☆13Feb 21, 2025Updated last year
- TACL 2025: Investigating Adversarial Trigger Transfer in Large Language Models☆19Aug 17, 2025Updated 7 months ago
- Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20…☆28Nov 2, 2023Updated 2 years ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆41Aug 7, 2025Updated 7 months ago
- Paper: Lexicon Learning for Few-Shot Neural Sequence Modeling☆16Jan 8, 2022Updated 4 years ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents☆23Jan 6, 2026Updated 2 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"☆605Oct 7, 2025Updated 5 months ago
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.☆13Mar 16, 2023Updated 3 years ago
- Visual Verb Sense Disambiguation☆13Apr 26, 2019Updated 6 years ago
- This is the repository of the Dense Hierarchical Retrieval for Open-Domain Question Answering☆14Dec 23, 2021Updated 4 years ago
- ☆11Feb 28, 2024Updated 2 years ago
- EMNLP 2020: On the Ability and Limitations of Transformers to Recognize Formal Languages☆24Oct 10, 2020Updated 5 years ago
- More Information about Features, Deliverables and Publications @☆11May 17, 2016Updated 9 years ago
- A Benchmark for Evaluating Safety and Trustworthiness in Web Agents for Enterprise Scenarios☆21Mar 12, 2026Updated 2 weeks ago
- FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback☆12Jul 13, 2022Updated 3 years ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"☆87Aug 12, 2024Updated last year
- Scorpius: Poisoning scientific knowledge using large language models☆11Aug 3, 2024Updated last year
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …☆11Jun 18, 2024Updated last year
- [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns☆13Mar 1, 2025Updated last year
- Data splits for the NAACL 2016 paper☆22Mar 17, 2016Updated 10 years ago
- Implementation of the Mask R-CNN model using OCaml's numerical library Owl.☆19Jan 30, 2020Updated 6 years ago
- Predicting Hashtag from Instagram pictures using Tensorflow, TFRecords, and TF-Slim☆15Nov 7, 2016Updated 9 years ago
- customized dash NGL viewer☆12Jan 6, 2023Updated 3 years ago
- Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"☆14Mar 25, 2025Updated last year
- Tools for managing BibTeX bibliographies: automatically update preprints to published versions and filter to only cited references.☆81Feb 22, 2026Updated last month
- Curated list of awesome ML Visualization Libraries☆13Jun 23, 2023Updated 2 years ago
- ACL 2023 paper "A Critical Evaluation of Evaluations for Long-form Question Answering"☆21Mar 22, 2024Updated 2 years ago
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM☆86Nov 3, 2024Updated last year
- An enterprise deep research benchmark☆35Mar 22, 2026Updated last week
- ☆24Feb 4, 2026Updated last month
- [WSDM'2025] "MixRec: Heterogeneous Graph Collaborative Filtering"☆21Dec 19, 2024Updated last year
- code of paper "Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM"☆14Nov 17, 2023Updated 2 years ago
- Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)☆21Jun 2, 2025Updated 9 months ago
- Code for "On Measuring Faithfulness of Natural Language Explanations"☆21Jul 23, 2024Updated last year
- [NDSS'25] The official implementation of safety misalignment.☆17Jan 8, 2025Updated last year
- Benchmarking Large Language Models' Resistance to Malicious Code☆14Dec 1, 2024Updated last year
- Common repo and documentation space for DataMeet Pune chapter☆16Jun 7, 2019Updated 6 years ago
- Aurora is a central design system for all products and applications for the Open, Accessible Digital Workspace. This repo is for all code…☆16Feb 23, 2024Updated 2 years ago
- [SIGIR'22] Official PyTorch implementation for "Learning to Denoise Unreliable Interactions for Graph Collaborative Filtering".☆18Oct 24, 2022Updated 3 years ago