Dataset for the Tensor Trust project
☆47Mar 17, 2024Updated 2 years ago
Alternatives and similar repositories for tensor-trust-data
Users that are interested in tensor-trust-data are comparing it to the libraries listed below.
- official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries☆75Nov 10, 2025Updated 7 months ago
- Official Repository for Can Language Models be Instructed to Protect Personal Information?☆13Oct 8, 2023Updated 2 years ago
- Pronto is an automation suite for deploying and managing DataStax Cassandra clusters in AWS.☆14Jul 1, 2020Updated 5 years ago
- Augmenting Statistical Models with Natural Language Parameters☆28Sep 17, 2024Updated last year
- RuLES: a benchmark for evaluating rule-following in language models☆254Feb 24, 2025Updated last year
- Large Language Model for Blockchain☆58May 24, 2023Updated 3 years ago
- robust polynomial multiplication in modulo m☆19Apr 30, 2016Updated 10 years ago
- A text-based game where language models learn to lie and to detect lies.☆12Oct 4, 2023Updated 2 years ago
- We present **FOCI**, a benchmark for Fine-grained Object ClassIfication for large vision language models (LVLMs).☆19Jun 21, 2024Updated last year
- ☆31Jul 14, 2023Updated 2 years ago
- Evaluate Transformers from the Hub 🔥☆14May 26, 2026Updated 2 weeks ago
- [ACL'24 Findings] Official code for "TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback"☆12Dec 6, 2024Updated last year
- A toolkit to automatically crawl the paper list and download paper PDFs from the ACL Anthology.☆11Nov 12, 2025Updated 7 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models☆56Aug 17, 2024Updated last year
- Panda Guard is designed for researching jailbreak attacks, defenses, and evaluation algorithms for large language models (LLMs).☆68Mar 23, 2026Updated 2 months ago
- Adversarial detection and defense for deep learning systems using robust feature alignment☆17Nov 10, 2020Updated 5 years ago
- [ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M…☆445Jan 22, 2025Updated last year
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!☆11Oct 16, 2024Updated last year
- ☆12Jun 13, 2025Updated last year
- Code and data for "Inferring Rewards from Language in Context" [ACL 2022].☆16May 22, 2022Updated 4 years ago
- ☆139Jul 7, 2025Updated 11 months ago
- Code for "A Principled Framework for Multi-View Contrastive Learning"☆20Jul 10, 2025Updated 11 months ago
- ☆19Aug 10, 2024Updated last year
- Improving Alignment and Robustness with Circuit Breakers☆262Sep 24, 2024Updated last year
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews☆17Dec 14, 2025Updated 5 months ago
- An isolated environment for DNS cache poisoning attack investigation and demonstration.☆10Nov 22, 2020Updated 5 years ago
- A Complex Judge, designed for ACOJ.☆10Sep 5, 2016Updated 9 years ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.☆117Jun 13, 2024Updated 2 years ago
- Codes and datasets of the paper Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment☆111Mar 8, 2024Updated 2 years ago
- A starter kit for evaluating benchmarks on the 🤗 Hub☆16Apr 8, 2026Updated 2 months ago
- ☆17Mar 20, 2025Updated last year
- TAP: An automated jailbreaking method for black-box LLMs☆237Dec 10, 2024Updated last year
- ☆19Apr 7, 2025Updated last year
- Code for a research paper "Part-Based Models Improve Adversarial Robustness" (ICLR 2023)☆20Sep 16, 2023Updated 2 years ago
- ☆16Mar 22, 2025Updated last year
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆15Jun 21, 2024Updated last year
- ☆11Jan 19, 2025Updated last year
- Use GPT-3 to generate competitive programming ideas.☆12Feb 29, 2024Updated 2 years ago
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training"☆11Oct 27, 2025Updated 7 months ago