☆34Oct 21, 2025Updated 6 months ago
Alternatives and similar repositories for selfplay-redteaming
Users that are interested in selfplay-redteaming are comparing it to the libraries listed below.
- A library for soft differentiable relaxations of common PyTorch functions.☆69Mar 14, 2026Updated last month
- Official release of code for the paper RL is a hammer and LLMs are nails: A simple RL approach to stronger prompt injection attacks☆45Apr 13, 2026Updated 2 weeks ago
- Compositional Abstractions Tutorial☆14Nov 26, 2023Updated 2 years ago
- A library for soft differentiable relaxations of common JAX functions.☆72Apr 9, 2026Updated 3 weeks ago
- ☆12Apr 26, 2024Updated 2 years ago
- The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1☆68Apr 8, 2026Updated 3 weeks ago
- ☆19Aug 19, 2025Updated 8 months ago
- Solving the OpenAI Gym (MountainCarContinuous-v0) with DDPG☆21Jan 23, 2023Updated 3 years ago
- https://interactivetraining.ai/☆17Oct 2, 2025Updated 6 months ago
- Repository for the paper: "TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining" ACL Oral 2025☆23Apr 19, 2026Updated last week
- TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition☆30Feb 5, 2026Updated 2 months ago
- Solving some interesting problems using Python and C++☆14Aug 16, 2020Updated 5 years ago
- ☆17May 19, 2023Updated 2 years ago
- [NAACL 2025] Official Code Repository for the paper "Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval"☆22Jul 13, 2025Updated 9 months ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur…☆11Aug 24, 2022Updated 3 years ago
- Official Repository for Task-Circuit Quantization☆25Jun 1, 2025Updated 11 months ago
- Prompt + regex lab☆10Nov 22, 2023Updated 2 years ago
- Make open-weight LLM agents play the game "Among Us", and study how the models learn and express lying and deception in the game.☆32Dec 17, 2025Updated 4 months ago
- Scratchpad/Chain-of-Thought Prompts☆12Jun 6, 2022Updated 3 years ago
- The Harmonic Memory☆18Oct 18, 2023Updated 2 years ago
- CR-LT KGQA Dataset Repository☆10Jun 1, 2025Updated 11 months ago
- ☆10May 27, 2024Updated last year
- ☆12May 27, 2022Updated 3 years ago
- HumanLM: Simulating Users with State Alignment Beats Response Imitation☆74Feb 27, 2026Updated 2 months ago
- Reverse Engineering Imperceptible Backdoor Attacks on Deep Neural Networks for Detection and Training Set Cleansing☆14Feb 18, 2021Updated 5 years ago
- Code of "Visualizing and Understanding Object Detecor"☆20Jun 24, 2021Updated 4 years ago
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks☆20Sep 18, 2025Updated 7 months ago
- The official Python wrapper for the EBSCO Discovery Service API☆16Jul 26, 2024Updated last year
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆30Jun 30, 2025Updated 10 months ago
- Reading Group @mila-iqia on Computational Optimal Transport for Machine Learning Applications☆13Jun 3, 2022Updated 3 years ago
- Codes for the paper "Optimizing Mode Connectivity via Neuron Alignment" from NeurIPS 2020.☆16Dec 10, 2020Updated 5 years ago
- 99 problems, but a driver ain't one. (Push code, not buggies)☆26Oct 12, 2020Updated 5 years ago
- The official implementation of the paper "AgentDyn: A Dynamic Open-Ended Benchmark for Evaluating Prompt Injection Attacks of Real-World …☆48Apr 19, 2026Updated last week
- ☆12Feb 21, 2022Updated 4 years ago
- A PyTorch implementation of the paper "Provably Efficient Online RLHF with One-Pass Reward Modeling". This repository provides a flexible…☆92Dec 13, 2025Updated 4 months ago
- The Python programming language☆65Dec 19, 2025Updated 4 months ago
- Forced alignment for karaokes☆22Updated this week
- A web app for easily printing proxies (copy cards) for card games.☆18May 13, 2025Updated 11 months ago
- ☆16Sep 4, 2024Updated last year