CarperAI / Code-Pile
This repository contains the code for collecting code from GitHub at large scale.
☆105 · Updated last year
Related projects:
- ☆73 · Updated last year
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswa… · ☆36 · Updated 10 months ago
- ☆71 · Updated last year
- A set of utilities for running few-shot prompting experiments on large language models · ☆106 · Updated 10 months ago
- A hard gym for programming · ☆136 · Updated 2 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." · ☆60 · Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification · ☆96 · Updated 10 months ago
- Code for the paper "Efficient Training of Language Models to Fill in the Middle" · ☆162 · Updated last year
- ☆91 · Updated 5 months ago
- Data and code for "DocPrompting: Generating Code by Retrieving the Docs" @ICLR 2023 · ☆229 · Updated 9 months ago
- Experiments with generating opensource language model assistants · ☆97 · Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation · ☆42 · Updated 8 months ago
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) · ☆76 · Updated last year
- For experiments involving instruct gpt. Currently used for documenting open research questions. · ☆71 · Updated last year
- Accepted by Transactions on Machine Learning Research (TMLR) · ☆115 · Updated 8 months ago
- Script for downloading GitHub. · ☆87 · Updated 2 months ago
- Multi-Domain Expert Learning · ☆67 · Updated 7 months ago
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks · ☆204 · Updated 8 months ago
- ☆47 · Updated last month
- One stop shop for all things carp · ☆58 · Updated 2 years ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation · ☆99 · Updated last month
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast · ☆131 · Updated 2 weeks ago
- Code repository for the c-BTM paper · ☆105 · Updated 11 months ago
- Pre-training code for CrystalCoder 7B LLM · ☆52 · Updated 4 months ago
- Fine-tune SantaCoder for Code/Text Generation. · ☆182 · Updated last year
- Camel-Coder: Collaborative task completion with multiple agents. Role-based prompts, intervention mechanism, and thoughtful suggestions · ☆33 · Updated last year
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions · ☆68 · Updated last year
- ☆34 · Updated last month
- ☆121 · Updated 10 months ago
- ☆174 · Updated last year