☆37Oct 15, 2024Updated last year
Alternatives and similar repositories for KnowledgeSpread
Users who are interested in KnowledgeSpread are comparing it to the repositories listed below
- ☆23Oct 11, 2024Updated last year
- Reproduction Code for Paper "Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models"☆13Jun 1, 2024Updated last year
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- ☆11Jan 3, 2024Updated 2 years ago
- ☆26Jun 28, 2025Updated 8 months ago
- ☆52Feb 8, 2025Updated last year
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"☆16Nov 5, 2024Updated last year
- ☆17Feb 26, 2024Updated 2 years ago
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"☆47Oct 13, 2025Updated 4 months ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning☆30Mar 5, 2024Updated last year
- EMNLP2023 - InfoSeek: A New VQA Benchmark focused on Visual Info-Seeking Questions☆25May 30, 2024Updated last year
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆28May 28, 2024Updated last year
- ☆26Jun 5, 2024Updated last year
- ☆12Sep 14, 2021Updated 4 years ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use☆190Mar 22, 2024Updated last year
- ☆105Aug 11, 2025Updated 6 months ago
- Agent Security Bench (ASB)☆186Oct 27, 2025Updated 4 months ago
- A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark☆32Feb 20, 2026Updated last week
- Methods and evaluation for aligning language models temporally☆30Mar 2, 2024Updated 2 years ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆31Jul 9, 2024Updated last year
- This is the implementation of LeCo☆31Jan 20, 2025Updated last year
- ☆11May 25, 2023Updated 2 years ago
- [ACL 2024] Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models☆41Jun 4, 2024Updated last year
- Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT☆36Oct 15, 2023Updated 2 years ago
- ☆164Sep 2, 2024Updated last year
- Code and data for the paper: Competing Large Language Models in Multi-Agent Gaming Environments☆95Jan 26, 2026Updated last month
- FGLA: Fast Generation-Based Gradient Leakage Attacks against Highly Compressed Gradients☆14Dec 20, 2022Updated 3 years ago
- ☆11Apr 6, 2019Updated 6 years ago
- ☆12Jan 11, 2026Updated last month
- [CVPR2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated last year
- PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k☆11Mar 14, 2024Updated last year
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- ☆16Nov 8, 2024Updated last year
- Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]☆109Sep 27, 2024Updated last year
- [ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"☆38Jul 12, 2024Updated last year
- ☆49Aug 6, 2024Updated last year
- A Chinese financial LLM evaluation benchmark: six categories, twenty-five tasks, graded evaluation; domestic models achieve grade A☆10May 6, 2024Updated last year
- This repository contains reference implementation for multi-LLM ToM paper (accepted to EMNLP 2023), Theory of Mind for Multi-Agent Collab…☆18Jun 11, 2024Updated last year