facebookresearch / BigOBench
BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend the time and space computational complexity of input or generated code.
☆40 Updated 9 months ago
Alternatives and similar repositories for BigOBench
Users who are interested in BigOBench are comparing it to the libraries listed below.
- ☆29 Updated 2 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆46 Updated 5 months ago
- UQ: Assessing Language Models on Unsolved Questions ☆30 Updated 5 months ago
- Resa: Transparent Reasoning Models via SAEs ☆47 Updated 4 months ago
- ☆22 Updated 7 months ago
- ☆29 Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆61 Updated last year
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆116 Updated this week
- ☆75 Updated last year
- ☆33 Updated last year
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 Updated last month
- ☆35 Updated 8 months ago
- ☆28 Updated 2 months ago
- ☆19 Updated 5 months ago
- ☆45 Updated 7 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆96 Updated 8 months ago
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications ☆52 Updated 3 months ago
- Official repo of paper LM2 ☆46 Updated 11 months ago
- ☆89 Updated 3 months ago
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508) ☆64 Updated this week
- ☆21 Updated 6 months ago
- ☆91 Updated last year
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆150 Updated 4 months ago
- ☆130 Updated this week
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study ☆59 Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆32 Updated 5 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 Updated 3 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆63 Updated last year
- Exploration of automated dataset selection approaches at large scales. ☆52 Updated 10 months ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆94 Updated last year