A prompt injection game to collect data for robust ML research
☆70Jan 27, 2025Updated last year
Alternatives and similar repositories for tensor-trust
Users that are interested in tensor-trust are comparing it to the libraries listed below.
- Dataset for the Tensor Trust project☆48Mar 17, 2024Updated 2 years ago
- Course Planning tool for the CS major @ UC Chile.☆24Dec 30, 2025Updated 5 months ago
- The Infibench variant of bigcode-evaluation-harness --- a framework for the evaluation of autoregressive code generation language models.☆14Oct 19, 2024Updated last year
- Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime☆54Sep 19, 2023Updated 2 years ago
- Minimal workflows☆21Mar 19, 2024Updated 2 years ago
- Implementations and demo of a regular Backdoor and a Latent backdoor attack on Deep Neural Networks.☆19Jul 9, 2022Updated 3 years ago
- [NeurIPS 2023] Official Pytorch code for LOVM: Language-Only Vision Model Selection☆21Feb 3, 2024Updated 2 years ago
- [ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M…☆445Jan 22, 2025Updated last year
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts☆588Feb 27, 2026Updated 3 months ago
- Ghidra consonance and make it more ida-ish☆16Mar 11, 2019Updated 7 years ago
- TAP: An automated jailbreaking method for black-box LLMs☆237Dec 10, 2024Updated last year
- Qualifying Exam Preparing☆17May 7, 2025Updated last year
- Code for experiments on self-prediction as a way to measure introspection in LLMs☆16Dec 10, 2024Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models"☆28Mar 4, 2025Updated last year
- ☆140Jul 7, 2025Updated 11 months ago
- [ACL'24 Findings] Official code for "TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback"☆12Dec 6, 2024Updated last year
- A toolkit to automatically crawl the paper list and download paper PDFs from the ACL Anthology.☆11Nov 12, 2025Updated 7 months ago
- ☆18Nov 8, 2024Updated last year
- An automated data pipeline scaling RL to pretraining levels☆77Jun 2, 2026Updated 2 weeks ago
- Benchmark to estimate model sycophancy☆30Nov 30, 2025Updated 6 months ago
- Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizX…☆89Mar 15, 2024Updated 2 years ago
- State-Relabeling Adversarial Active Learning☆14Aug 17, 2021Updated 4 years ago
- This repo implements the CVPR23 paper Trainable Projected Gradient Method for Robust Fine-tuning☆24Nov 27, 2023Updated 2 years ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…☆23Nov 8, 2023Updated 2 years ago
- ☆14Dec 28, 2024Updated last year
- Code for the paper "Unbiased Supervised Contrastive Learning" | ICLR 2023 https://openreview.net/forum?id=Ph5cJSfD2XN☆12Sep 22, 2023Updated 2 years ago
- Important ideas☆18Oct 13, 2025Updated 8 months ago
- Text file containing NSFW words aggregated from various sources.☆11Aug 23, 2020Updated 5 years ago
- Code for "A Principled Framework for Multi-View Contrastive Learning"☆20Jul 10, 2025Updated 11 months ago
- ☆12Oct 23, 2022Updated 3 years ago
- upload a manim script and generate an animation☆11Mar 10, 2024Updated 2 years ago
- official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries☆75Nov 10, 2025Updated 7 months ago
- ☆744Jul 2, 2025Updated 11 months ago
- ☆13Oct 21, 2021Updated 4 years ago
- ☆45Jun 10, 2024Updated 2 years ago
- Code for the papers: "Stop Throwing Away Discriminators! Re-using Adversaries for Test-Time Training", Valvano et al., DART 2021; and "Re…☆10Jan 20, 2022Updated 4 years ago
- Code to conduct an embedding attack on LLMs☆32Jan 10, 2025Updated last year
- Code for Paper "The Geometry of Reasoning: Flowing Logics in Representation Space" (ICLR 2026)☆52Jan 31, 2026Updated 4 months ago
- Simple CLI demo for chatting with LIFI docs☆13Apr 18, 2023Updated 3 years ago