APEXLAB / CodeApex
☆49Updated last year
Related projects:
- A Comprehensive Benchmark for Software Development.☆84Updated 3 months ago
- AI Alignment: A Comprehensive Survey☆123Updated 10 months ago
- Pretrain, decay, and SFT a CodeLLM from scratch 🧙‍♂️☆30Updated 4 months ago
- ☆112Updated 4 months ago
- ☆52Updated 2 months ago
- ☆76Updated 4 months ago
- ☆82Updated 5 months ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models☆56Updated 7 months ago
- GAOKAO-Bench-2023 is a supplement to GAOKAO-Bench, a dataset for evaluating large language models.☆17Updated 9 months ago
- The Code Repo for Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization☆86Updated 2 weeks ago
- ☆75Updated 5 months ago
- Achieving Efficient Alignment through Learned Correction☆103Updated 3 months ago
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models".☆52Updated 2 months ago
- Token level visualization tools for large language models☆46Updated last month
- ☆124Updated 2 months ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆65Updated 11 months ago
- Feeling confused about super alignment? Here is a reading list☆42Updated 8 months ago
- ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆63Updated 5 months ago
- A reading list on LLM based Synthetic Data Generation 🔥☆105Updated last month
- ☆49Updated 6 months ago
- CodeRAG-Bench: Can Retrieval Augment Code Generation?☆54Updated 2 months ago
- ☆79Updated 5 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆21Updated 3 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit…☆61Updated 2 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"☆38Updated 2 months ago
- [ACL 2024] AUTOACT: Automatic Agent Learning from Scratch for QA via Self-Planning☆162Updated 5 months ago
- xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval☆70Updated 8 months ago
- Official implementation of the paper "How to Understand Whole Repository?" New SOTA on SWE-bench Lite (21.3%)☆55Updated 3 months ago
- ⏳ ChatLog: Recording and Analysing ChatGPT Across Time☆94Updated 3 months ago
- Towards Large Multimodal Models as Visual Foundation Agents☆87Updated 3 weeks ago