cometeme / funcoder
Implementation for NeurIPS 2024 oral paper: Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
☆12Updated 7 months ago
Alternatives and similar repositories for funcoder
Users that are interested in funcoder are comparing it to the libraries listed below
- GenRM-CoT: Data release for verification rationales☆65Updated 10 months ago
- Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"☆99Updated last week
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning☆190Updated this week
- A Sober Look at Language Model Reasoning☆81Updated 2 months ago
- Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs"☆38Updated 6 months ago
- Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples☆43Updated last month
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*☆113Updated 8 months ago
- ☆204Updated 5 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆86Updated last year
- A curated list of awesome resources dedicated to Scaling Laws for LLMs☆77Updated 2 years ago
- ☆33Updated 11 months ago
- ☆163Updated 3 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO)☆147Updated 6 months ago
- ☆74Updated 9 months ago
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective"☆20Updated 2 years ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆123Updated 11 months ago
- ☆66Updated 4 months ago
- ☆48Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆189Updated 4 months ago
- OpenReview Submission Visualization (ICLR 2024/2025)☆151Updated 10 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning☆41Updated 2 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆162Updated 5 months ago
- ☆120Updated 5 months ago
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)☆37Updated 3 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality☆208Updated last year
- This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers…☆88Updated 9 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style☆60Updated last month
- [COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models☆124Updated 2 weeks ago
- [ICML 2024] Self-Infilling Code Generation☆18Updated last year
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…☆29Updated 9 months ago