openai / code-align-evals-data

☆28

Alternatives and similar repositories for code-align-evals-data

Users who are interested in code-align-evals-data are comparing it to the repositories listed below

openai / human-eval-infilling
Code for the paper "Efficient Training of Language Models to Fill in the Middle"
☆188 Updated 2 years ago
EleutherAI / lm_perplexity
☆158 Updated 4 years ago
bigcode-project / bigcode-analysis
Repository for analysis and experiments in the BigCode project.
☆124 Updated last year
niansong1996 / lever
Code for the paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23)
☆90 Updated 2 years ago
salesforce / jaxformer
Minimal library to train LLMs on TPU in JAX with pjit().
☆298 Updated last year
princeton-nlp / intercode
[NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898
☆227 Updated last year
tomekkorbak / pretraining-with-human-feedback
Code accompanying the paper Pretraining Language Models with Human Preferences
☆180 Updated last year
LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆209 Updated last year
reddy-lab-code-research / PPOCoder
Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"
☆116 Updated last year
allenai / catwalk
This project studies the performance and robustness of language models and task-adaptation methods.
☆154 Updated last year
GammaTauAI / leetcode-hard-gym
A hard gym for programming
☆161 Updated last year
CarperAI / Code-Pile
This repository contains all the code for collecting large scale amounts of code from GitHub.
☆109 Updated 2 years ago
Zyq-scut / RLTF
Accepted by Transactions on Machine Learning Research (TMLR)
☆132 Updated last year
zorazrw / odex
[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆49 Updated last year
anthropics / ConstitutionalHarmlessnessPaper
☆242 Updated 2 years ago
haoliuhl / chain-of-hindsight
Simple next-token-prediction for RLHF
☆226 Updated 2 years ago
rowanz / hellaswag
HellaSwag: Can a Machine _Really_ Finish Your Sentence?
☆220 Updated 5 years ago
CarperAI / InstructGPT
For experiments involving InstructGPT. Currently used for documenting open research questions.
☆70 Updated 2 years ago
bigcode-project / the-stack-v2
Code for the curation of The Stack v2 and StarCoder2 training data
☆117 Updated last year
bhargaviparanjape / language-programmes
☆173 Updated 2 years ago
orhonovich / unnatural-instructions
☆179 Updated 2 years ago
facebookresearch / Shepherd
This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆217 Updated 2 years ago
EleutherAI / stackexchange-dataset
Python tools for processing the stackexchange data dumps into a text dataset for Language Models
☆82 Updated last year
reasoning-machines / prompt-lib
A set of utilities for running few-shot prompting experiments on large language models
☆123 Updated last year
CarperAI / autocrit
A repository for transformer critique learning and generation
☆88 Updated last year
amazon-science / mxeval
☆111 Updated last year
ntunlp / xCodeEval
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
☆86 Updated last year
facebookresearch / cruxeval
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
☆154 Updated last year
shunzh / Code-AI-Tree-Search
☆120 Updated last year
allenai / Lila
A unified benchmark for math reasoning
☆88 Updated 2 years ago