bigcode-project / bigcode-encoder
☆29 · Updated last year

Related projects

Alternatives and complementary repositories for bigcode-encoder
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆44 · Updated 10 months ago
- Repository for analysis and experiments in the BigCode project. ☆115 · Updated 8 months ago
- ☆48 · Updated 3 months ago
- ☆75 · Updated last year
- Code for the paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) ☆79 · Updated last year
- CodeUltraFeedback: aligning large language models to coding preferences ☆65 · Updated 4 months ago
- ☆38 · Updated 7 months ago
- A repository for transformer critique learning and generation ☆86 · Updated 11 months ago
- ☆101 · Updated 4 months ago
- Repo for the ICML'23 paper "Why do Nearest Neighbor Language Models Work?" ☆56 · Updated last year
- Simple and efficient PyTorch-native transformer training and inference (batched) ☆61 · Updated 7 months ago
- Code for the arXiv paper "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆59 · Updated 7 months ago
- ☆31 · Updated last year
- xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval ☆74 · Updated 2 months ago
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules" ☆35 · Updated 11 months ago
- [EMNLP 2023 Industry Track] A simple prompting approach that enables LLMs to run inference in batches. ☆69 · Updated 8 months ago
- RepoQA: Evaluating Long-Context Code Understanding ☆100 · Updated 2 weeks ago
- Accepted by Transactions on Machine Learning Research (TMLR) ☆119 · Updated last month
- Code for generating the JuICe dataset. ☆37 · Updated 3 years ago
- Astraios: Parameter-Efficient Instruction Tuning of Code Language Models ☆57 · Updated 7 months ago
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆51 · Updated this week
- ☆73 · Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆41 · Updated 9 months ago
- ☆54 · Updated 6 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆48 · Updated 7 months ago
- ☆71 · Updated 6 months ago
- Evol-augment any dataset online ☆55 · Updated last year
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆115 · Updated last month
- Script for downloading GitHub. ☆88 · Updated 4 months ago
- The LM Contamination Index, a manually curated database of contamination evidence for LMs. ☆76 · Updated 7 months ago