uiuc-focal-lab / llm-priming-attacks
☆15 · Updated last year
Alternatives and similar repositories for llm-priming-attacks
Users interested in llm-priming-attacks are comparing it to the repositories listed below.
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆137 · Updated 7 months ago
- RepoQA: Evaluating Long-Context Code Understanding ☆108 · Updated 6 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆102 · Updated last year
- Efficient and general syntactical decoding for Large Language Models ☆265 · Updated this week
- EvoEval: Evolving Coding Benchmarks via LLM ☆70 · Updated last year
- Iterate on LLM-based structured generation forward and backward ☆15 · Updated last month
- A certifier for bias in LLMs ☆24 · Updated last month
- FANC is a tool for the proof transfer of incomplete verification ☆11 · Updated 3 years ago
- [NeurIPS'24] SelfCodeAlign: Self-Alignment for Code Generation ☆307 · Updated 2 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated 11 months ago
- Model REVOLVER, a human-in-the-loop model mixing system ☆33 · Updated last year
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆68 · Updated last year
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆173 · Updated 2 months ago
- r2e: turn any GitHub repository into a programming agent environment ☆119 · Updated 3 weeks ago
- Enhancing AI Software Engineering with Repository-level Code Graph ☆164 · Updated last month
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ☆31 · Updated 10 months ago
- Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions ☆42 · Updated 9 months ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test generation ☆48 · Updated 3 weeks ago
- LLM Program Watermarking ☆17 · Updated last year
- [NeurIPS 2023 D&B] Code repository for the InterCode benchmark (https://arxiv.org/abs/2306.14898) ☆217 · Updated last year
- Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees", adapted for Llama models ☆35 · Updated last year
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules" ☆45 · Updated 4 months ago
- SLOP detector and analyzer, based on a dictionary, for ShareGPT JSON and text ☆67 · Updated 6 months ago
- A preprint version of our recent research on the capability of frontier AI systems to do self-replication ☆59 · Updated 4 months ago