LearningOpt / pie
☆50Updated 8 months ago
Alternatives and similar repositories for pie:
Users who are interested in pie are comparing it to the libraries listed below.
- Training language models to make programs faster☆87Updated 11 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation☆133Updated 5 months ago
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"☆109Updated last year
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback☆64Updated 7 months ago
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts☆30Updated 8 months ago
- SatLM: SATisfiability-Aided Language Models using Declarative Prompting (NeurIPS 2023)☆48Updated 8 months ago
- Code and data for XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence☆70Updated 2 months ago
- xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval☆78Updated 6 months ago
- RepoQA: Evaluating Long-Context Code Understanding☆106Updated 5 months ago
- Code for ICML 2021 paper: How could Neural Networks understand Programs?☆123Updated 4 months ago
- Reproducing R1 for Code with Reliable Rewards☆140Updated 3 weeks ago
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models".☆70Updated 8 months ago
- r2e: turn any GitHub repository into a programming agent environment☆107Updated 3 weeks ago
- ComPy-Learn is a framework for exploring program representations for ML4CODE tasks.☆23Updated last year
- Tzer: TVM Implementation of "Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation" (OOPSLA'22).☆70Updated 2 years ago
- DafnyBench: A Benchmark for Formal Software Verification☆31Updated 3 months ago
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch☆18Updated 2 years ago
- Automatic DNN generation for fuzzing and more☆129Updated 2 months ago
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)☆136Updated 8 months ago
- Utilities for constructing a large dataset of LLVM IR☆18Updated 7 months ago
- [EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code☆72Updated 9 months ago
- A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.☆51Updated 5 months ago
- Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings☆92Updated this week
- EvoEval: Evolving Coding Benchmarks via LLM☆68Updated 11 months ago
- [FSE-2024] Towards AI-Assisted Synthesis of Verified Dafny Methods☆42Updated 9 months ago
- Making code editing up to 7.7x faster using multi-layer speculation☆19Updated last month