facebookresearch / BigOBench
BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend the time and space computational complexity of input or generated code.
☆35 Updated 3 months ago
Alternatives and similar repositories for BigOBench
Users who are interested in BigOBench are comparing it to the libraries listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated 10 months ago
- ☆33 Updated 2 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆33 Updated 4 months ago
- ☆55 Updated 3 weeks ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆91 Updated 2 months ago
- ☆47 Updated 5 months ago
- Official repo of paper LM2 ☆41 Updated 5 months ago
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated last week
- ☆23 Updated last month
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆58 Updated last year
- Resa: Transparent Reasoning Models via SAEs ☆40 Updated last month
- Training and Benchmarking LLMs for Code Preference. ☆33 Updated 8 months ago
- NeurIPS 2024 tutorial on LLM Inference ☆45 Updated 7 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆38 Updated 4 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆63 Updated 3 months ago
- The repository contains code for Adaptive Data Optimization ☆25 Updated 7 months ago
- ☆19 Updated 4 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆27 Updated 4 months ago
- Codebase for Instruction Following without Instruction Tuning ☆35 Updated 9 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆19 Updated 4 months ago
- ☆82 Updated 11 months ago
- ☆34 Updated 3 weeks ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆35 Updated last year
- ☆27 Updated 6 months ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆90 Updated 8 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆64 Updated last year
- ☆33 Updated 6 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆47 Updated 3 months ago
- Process Reward Models That Think ☆46 Updated 2 weeks ago
- ☆20 Updated last year