SalesforceAIResearch / indict_code_gen
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
☆14Updated 2 months ago
Alternatives and similar repositories for indict_code_gen
Users who are interested in indict_code_gen are comparing it to the repositories listed below.
- Training and Benchmarking LLMs for Code Preference.☆37Updated last year
- Codebase for Inference-Time Policy Adapters☆25Updated 2 years ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback☆74Updated last year
- ☆44Updated 9 months ago
- Implementation and datasets for "Training Language Models to Generate Quality Code with Program Analysis Feedback"☆39Updated 6 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)☆84Updated last year
- ☆33Updated this week
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23)☆90Updated 2 years ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models☆63Updated last year
- RepoQA: Evaluating Long-Context Code Understanding☆128Updated last year
- ☆22Updated 7 months ago
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw☆64Updated last year
- [COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees☆31Updated 6 months ago
- ☆85Updated last year
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"☆118Updated 2 years ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"☆124Updated last year
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods☆163Updated 7 months ago
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts☆35Updated last year
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation☆165Updated last year
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆124Updated last year
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups☆50Updated last year
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆63Updated last year
- Pseudo-code Instructions dataset☆27Updated 2 years ago
- Bayesian scaling laws for in-context learning.☆15Updated 10 months ago
- ☆41Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆49Updated 2 years ago
- Replicating O1 inference-time scaling laws☆92Updated last year
- ☆20Updated last year
- ☆119Updated last year
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location.☆85Updated last year