locuslab / llm-idiosyncrasies
Code release for "Idiosyncrasies in Large Language Models"
☆52 · Updated 5 months ago
Alternatives and similar repositories for llm-idiosyncrasies
Users that are interested in llm-idiosyncrasies are comparing it to the libraries listed below
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods · ☆161 · Updated 6 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives · ☆70 · Updated last year
- ☆45 · Updated 10 months ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs · ☆94 · Updated last year
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) · ☆85 · Updated last year
- ☆86 · Updated 11 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) · ☆41 · Updated last year
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? · ☆82 · Updated 11 months ago
- PASTA: Post-hoc Attention Steering for LLMs · ☆132 · Updated last year
- AnchorAttention: Improved attention for LLMs long-context training · ☆213 · Updated 11 months ago
- The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution · ☆204 · Updated this week
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] · ☆147 · Updated last year
- ☆150 · Updated 2 years ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers · ☆162 · Updated last month
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) · ☆35 · Updated 10 months ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation · ☆73 · Updated 8 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting · ☆34 · Updated last year
- ☆54 · Updated 9 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models · ☆123 · Updated last year
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025) · ☆26 · Updated 10 months ago
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs · ☆50 · Updated last year
- ☆51 · Updated last year
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast · ☆117 · Updated last year
- Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing" · ☆31 · Updated 9 months ago
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups · ☆48 · Updated last year
- AIR-Bench 2024 is a safety benchmark that aligns with emerging government regulations and company policies · ☆28 · Updated last year
- Functional Benchmarks and the Reasoning Gap · ☆90 · Updated last year
- From $f(x)$ and $g(x)$ to $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones · ☆55 · Updated 2 months ago
- ☆110 · Updated 8 months ago
- Exploration of automated dataset selection approaches at large scales. · ☆53 · Updated 10 months ago