locuslab / llm-idiosyncrasies

Code release for "Idiosyncrasies in Large Language Models"

☆45

Alternatives and similar repositories for llm-idiosyncrasies

Users who are interested in llm-idiosyncrasies are comparing it to the repositories listed below.

ahans30 / goldfish-loss
[NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs
☆92 Updated 10 months ago
zzwjames / FailureLLMUnlearning
An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)
☆32 Updated 7 months ago
sail-sg / Cheating-LLM-Benchmarks
[ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)
☆83 Updated 11 months ago
stanfordnlp / axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
☆136 Updated 3 months ago
katiekang1998 / reasoning_generalization
☆33 Updated 9 months ago
JonasGeiping / carving
Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives
☆69 Updated last year
LukeBailey181 / obfuscated-activations
Codebase for Obfuscated Activations Bypass LLM Latent-Space Defenses
☆24 Updated 8 months ago
collinzrj / output2prompt
☆46 Updated 7 months ago
JoshEngels / SAE-Probes
Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"
☆29 Updated 6 months ago
AlexCuadron / ThinkingAgent
Systematic evaluation framework that automatically rates overthinking behavior in large language models.
☆93 Updated 4 months ago
g-luo / vlm_cross_modal_reps
Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025
☆31 Updated 5 months ago
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆120 Updated 6 months ago
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆213 Updated 8 months ago
clinicalml / co-llm
Co-LLM: Learning to Decode Collaboratively with Multiple Language Models
☆121 Updated last year
facebookresearch / AbstentionBench
A holistic benchmark for LLM abstention
☆53 Updated last month
OpenStellarTeam / DeltaBench
☆44 Updated 7 months ago
sail-sg / Rigging-ChatbotArena
Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)
☆22 Updated 7 months ago
MadryLab / DsDm
☆50 Updated last year
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆81 Updated 10 months ago
mcleish7 / gemstone-scaling-laws
Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)
☆29 Updated 2 weeks ago
csinva / tree-prompt
Tree prompting: easy-to-use scikit-learn interface for improved prompting.
☆40 Updated last year
azshue / AutoPoison
The official repository of the paper "On the Exploitability of Instruction Tuning".
☆65 Updated last year
rdi-berkeley / awesome-RLVR-boundary
A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu…
☆57 Updated this week
Luckfort / CD
[COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
☆80 Updated 8 months ago
JinjieNi / dlms-are-super-data-learners
The official github repo for "Diffusion Language Models are Super Data Learners".
☆129 Updated last week
hughbzhang / o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
☆90 Updated 10 months ago
JacobPfau / fillerTokens
☆72 Updated last year
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆25 Updated 10 months ago
Wang-ML-Lab / multimodal-needle-in-a-haystack
[NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models
☆49 Updated 5 months ago
casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆59 Updated last year