NextWordDev / psychoevals
Repository for PsychoEvals - a framework for LLM security, psychoanalysis, and moderation.
☆17Updated 2 years ago
Alternatives and similar repositories for psychoevals
Users who are interested in psychoevals are comparing it to the libraries listed below.
- Large Language Models Meet NL2Code: A Survey☆35Updated 11 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a…☆433Updated last year
- Analyzing and scoring reasoning traces of LLMs☆46Updated last year
- Can AI-Generated Text be Reliably Detected?☆86Updated last year
- LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces☆98Updated 2 years ago
- Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning (Zhou et al.; EMNLP 2023 Findings)☆17Updated last year
- Code for Preventing Language Models From Hiding Their Reasoning, which evaluates defenses against LLM steganography.☆24Updated last year
- Gentopia Agent Zoo and Agent Benchmark☆31Updated 2 years ago
- A collection of works that investigate social agents, simulations and their real-world impact in text, embodied, and robotics contexts.☆98Updated last year
- Source code for the paper "Active Prompting with Chain-of-Thought for Large Language Models"☆246Updated last year
- A set of utilities for running few-shot prompting experiments on large-language models☆126Updated 2 years ago
- [ICML 2025] Weak-to-Strong Jailbreaking on Large Language Models☆88Updated 6 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use☆171Updated last year
- ☆47Updated 7 months ago
- Codes for the EMNLP 2023 Findings paper "Self-Polish: Enhance Reasoning in Large Language Models via Problem Refining" by Zhiheng Xi, Sen…☆30Updated 2 years ago
- ☆40Updated last year
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives☆69Updated last year
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.☆57Updated 7 months ago
- Data and code for "DocPrompting: Generating Code by Retrieving the Docs" @ICLR 2023☆251Updated last year
- Code for the AAAI 2023 paper "CodeAttack: Code-based Adversarial Attacks for Pre-Trained Programming Language Models"☆33Updated 2 years ago
- The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang…☆141Updated 2 months ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities.☆164Updated last year
- Plurals: A System for Guiding LLMs Via Simulated Social Ensembles☆28Updated 2 weeks ago
- repo for the paper titled “CodeGen4Libs: A Two-Stage Approach for Library-Oriented Code Generation”☆14Updated 2 years ago
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu…☆54Updated 2 years ago
- Code and data of the EMNLP 2022 paper "Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversaria…☆61Updated 2 years ago
- 🌟 SwarmAgent: A framework for simulating social group dynamics using multi-agent collaboration, aiding insights into collective behavior…☆12Updated last year
- A repository for the paper "Beliefs about AI influence human-AI interaction and can be manipulated to increase perceived trustworthiness,…☆17Updated 2 years ago
- ☆66Updated 10 months ago
- Codes and datasets of the paper Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment☆106Updated last year