Dataset for the Tensor Trust project
☆48Mar 17, 2024Updated 2 years ago
Alternatives and similar repositories for tensor-trust-data
Users that are interested in tensor-trust-data are comparing it to the libraries listed below.
- A prompt injection game to collect data for robust ML research☆69Jan 27, 2025Updated last year
- ☆15Jun 7, 2024Updated last year
- official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries☆68Nov 10, 2025Updated 5 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.☆515Mar 30, 2026Updated 2 weeks ago
- Tasks for describing differences between text distributions.☆17Aug 9, 2024Updated last year
- ☆23May 20, 2025Updated 10 months ago
- RuLES: a benchmark for evaluating rule-following in language models☆247Feb 24, 2025Updated last year
- Large Language Model for Blockchain☆59May 24, 2023Updated 2 years ago
- robust polynomial multiplication in modulo m☆19Apr 30, 2016Updated 9 years ago
- We present **FOCI**, a benchmark for Fine-grained Object ClassIfication for large vision language models (LVLMs).☆19Jun 21, 2024Updated last year
- ☆31Jul 14, 2023Updated 2 years ago
- [ACL'24 Findings] Official code for "TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback"☆12Dec 6, 2024Updated last year
- PAL: Proxy-Guided Black-Box Attack on Large Language Models☆56Aug 17, 2024Updated last year
- [ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M…☆436Jan 22, 2025Updated last year
- Panda Guard is designed for researching jailbreak attacks, defenses, and evaluation algorithms for large language models (LLMs).☆66Mar 23, 2026Updated 3 weeks ago
- Adversarial detection and defense for deep learning systems using robust feature alignment☆17Nov 10, 2020Updated 5 years ago
- Recognize faces/objects in a video stream (from a webcam or a security camera) and send notifications to your devices☆12May 12, 2024Updated last year
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!☆11Oct 16, 2024Updated last year
- Code and data for "Inferring Rewards from Language in Context" [ACL 2022].☆16May 22, 2022Updated 3 years ago
- The repository describing a novel dataset for word association explanations☆13Sep 21, 2023Updated 2 years ago
- ☆19Aug 10, 2024Updated last year
- Improving Alignment and Robustness with Circuit Breakers☆260Sep 24, 2024Updated last year
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews☆17Dec 14, 2025Updated 4 months ago
- derper's mom☆13Oct 28, 2025Updated 5 months ago
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding☆152Jul 19, 2024Updated last year
- A Complex Judge, designed for ACOJ.☆10Sep 5, 2016Updated 9 years ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.☆115Jun 13, 2024Updated last year
- TAP: An automated jailbreaking method for black-box LLMs☆225Dec 10, 2024Updated last year
- Codes and datasets of the paper Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment☆109Mar 8, 2024Updated 2 years ago
- Public repo for ETH Escape CTF @ Devcon 2024: https://devcon.org/☆13Dec 11, 2024Updated last year
- A starter kit for evaluating benchmarks on the 🤗 Hub☆16Updated this week
- ☆16Mar 20, 2025Updated last year
- line-drawer recreates a given image by only drawing it by simple straight lines. Implementation inspired by linify.me and written in Pyth…☆14Aug 31, 2023Updated 2 years ago
- A symbolic benchmark for verifiable chain-of-thought financial reasoning. Includes executable templates, 58 topics across 12 domains, and…☆28Dec 26, 2025Updated 3 months ago
- ☆18Apr 7, 2025Updated last year
- ☆16Mar 22, 2025Updated last year
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆15Jun 21, 2024Updated last year
- ☆11Jan 19, 2025Updated last year
- Use GPT-3 to generate competitive programming ideas.☆11Feb 29, 2024Updated 2 years ago