APEXLAB / CodeApex
☆49Updated last year
Alternatives and similar repositories for CodeApex
Users who are interested in CodeApex are comparing it to the libraries listed below.
- NaturalCodeBench (Findings of ACL 2024)☆65Updated 8 months ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback☆65Updated 9 months ago
- ☆41Updated 6 months ago
- xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval☆83Updated 9 months ago
- ☆47Updated last year
- ☆82Updated last year
- Pretrain, decay, and SFT a CodeLLM from scratch 🧙‍♂️☆36Updated last year
- ☆142Updated 11 months ago
- Feeling confused about super alignment? Here is a reading list☆42Updated last year
- Reproducing R1 for Code with Reliable Rewards☆221Updated last month
- code for Scaling Laws of RoPE-based Extrapolation☆73Updated last year
- ☆46Updated this week
- Official repository for our paper "FullStack Bench: Evaluating LLMs as Full Stack Coders"☆92Updated last month
- A Comprehensive Benchmark for Software Development.☆110Updated last year
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"☆113Updated last year
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆68Updated last month
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models☆58Updated last year
- Official github repo for AutoDetect, an automated weakness detection framework for LLMs.☆42Updated last year
- ☆31Updated last week
- Advancing LLM with Diverse Coding Capabilities☆73Updated 11 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆61Updated 6 months ago
- Code LLM pretraining, fine-tuning, and DPO data processing: industry SOTA processing pipeline☆42Updated 11 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆130Updated last year
- ☆55Updated last week
- ☆21Updated 2 months ago
- The implementation of paper "LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Fee…☆38Updated 11 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation☆145Updated 8 months ago
- Collection of papers for scalable automated alignment.☆91Updated 8 months ago
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆40Updated last year
- Accepted by Transactions on Machine Learning Research (TMLR)☆128Updated 8 months ago