microsoft / DataScienceProblems
A repository containing the Jupyter notebook code generation benchmark.
☆59Updated 3 years ago
Alternatives and similar repositories for DataScienceProblems:
Users who are interested in DataScienceProblems are comparing it to the libraries listed below:
- Official code release for the paper Coder Reviewer Reranking for Code Generation.☆43Updated 2 years ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆47Updated last year
- Code for generating the JuICe dataset.☆37Updated 3 years ago
- ☆53Updated last year
- Code Generator☆23Updated 2 years ago
- ☆115Updated 9 months ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data…☆25Updated 2 years ago
- PROSE Public Benchmark Suite☆25Updated 6 months ago
- Code, datasets and results of the ChatGPT evaluation presented in paper "ChatGPT: Jack of all trades, master of none"☆29Updated 2 years ago
- ☆75Updated last month
- Code for our paper: "GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models"☆53Updated 2 years ago
- Graph4Tree is a simple example code for our EMNLP'20 Findings paper idea.☆26Updated 4 years ago
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23)☆86Updated last year
- ☆46Updated last year
- [EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code☆73Updated 10 months ago
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022)☆71Updated 2 years ago
- ☆29Updated last year
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral, ACL 2024 SRW☆59Updated 6 months ago
- Repository for Decomposed Prompting☆90Updated last year
- Web queries dataset for code search☆32Updated last year
- Code for the NLP4Prog workshop paper "Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation"☆21Updated 3 years ago
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.☆42Updated last month
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition☆22Updated last year
- Detect hallucinated tokens for conditional sequence generation.☆64Updated 3 years ago
- Training language models to make programs faster☆87Updated last year
- Dataset and code for Findings of EMNLP'21 paper "CodeQA: A Question Answering Dataset for Source Code Comprehension".☆42Updated last year
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…☆19Updated 10 months ago
- ☆51Updated last month
- ☆24Updated 5 months ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models☆57Updated last year