☆70, Feb 9, 2026, updated 3 weeks ago
Alternatives and similar repositories for NL2RepoBench
Users who are interested in NL2RepoBench are comparing it to the repositories listed below.
- UQ: Assessing Language Models on Unsolved Questions (☆30, Aug 26, 2025, updated 6 months ago)
- Comprehensive GPU specifications database with 2,824 GPUs across NVIDIA, AMD, and Intel (☆59, Jan 7, 2026, updated last month)
- Public repository for the Remote Labor Index (RLI) (☆61, Nov 3, 2025, updated 4 months ago)
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents (☆248, Jul 13, 2025, updated 7 months ago)
- (☆47, Oct 28, 2025, updated 4 months ago)
- Measuring how well CLI agents like Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU in 10 hours (☆166, updated this week)
- BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend time-space computational complexity of input or generated c… (☆40, Apr 15, 2025, updated 10 months ago)
- This is a framework for evaluating reasoning in foundational Video Models. (☆74, Feb 24, 2026, updated last week)
- The code used to evaluate embedding models on the Massive Legal Embedding Benchmark (MLEB). (☆31, Feb 24, 2026, updated last week)
- (☆10, Aug 7, 2024, updated last year)
- Numbeo Unofficial API (☆15, Oct 16, 2022, updated 3 years ago)
- This repo is meant to serve as a guide for Machine Learning/AI technical interviews. (☆11, Mar 5, 2024, updated 2 years ago)
- A First Look at Conventional Commits Classification (☆12, Nov 18, 2024, updated last year)
- (☆36, Sep 6, 2024, updated last year)
- Easy Setup, File-based, Offline Capable Federated Learning and Computations (☆22, Feb 11, 2026, updated 3 weeks ago)
- [NeurIPS '25] GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents (☆68, Feb 26, 2026, updated last week)
- The official implementation of "ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering" (☆56, Jun 21, 2025, updated 8 months ago)
- (☆46, Mar 4, 2025, updated last year)
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi… (☆23, Oct 1, 2025, updated 5 months ago)
- (☆13, Jan 14, 2026, updated last month)
- A plugin for OpenCode. Make your coding agent learn and grow with every task. (☆36, Jan 31, 2026, updated last month)
- (☆20, May 24, 2025, updated 9 months ago)
- MLX Implementation of Recursive Reasoning with Tiny Networks (☆78, Oct 11, 2025, updated 4 months ago)
- Open Source Multivalue String Database (☆13, Feb 16, 2026, updated 2 weeks ago)
- EmbeddedLLM: API server for Embedded Device Deployment. Currently support CUDA/OpenVINO/IpexLLM/DirectML/CPU (☆50, Oct 6, 2024, updated last year)
- (☆59, May 21, 2025, updated 9 months ago)
- Official repository for our paper "FullStack Bench: Evaluating LLMs as Full Stack Coders" (☆113, May 7, 2025, updated 9 months ago)
- Attempt of Operating System (C++) (☆22, Oct 6, 2025, updated 5 months ago)
- A project for tri-modal LLM benchmarking and instruction tuning. (☆56, Mar 27, 2025, updated 11 months ago)
- Kidash: A GrimoireLab tool & library to manage Kibana/Kibiter visualizations and dashboards (☆13, updated this week)
- An ontology of imaging and related techniques and technologies, image processing and analysis, image data and formats, within bio- and ot… (☆12, Oct 26, 2025, updated 4 months ago)
- A curated collection of my agent-skills (☆25, Jan 25, 2026, updated last month)
- (☆13, May 17, 2025, updated 9 months ago)
- (☆11, Mar 15, 2024, updated last year)
- Software for building the IR Anthology. (☆11, Sep 19, 2023, updated 2 years ago)
- A collection of high-performance, modular utilities for enhancing testing, transactional consistency, efficiency, security and stability … (☆28, Jan 26, 2026, updated last month)
- Modern utility library and typescript typings for building JSON Schema documents (☆14, Nov 28, 2025, updated 3 months ago)
- Meta-repo for the Nowcasting project. (☆10, Oct 15, 2024, updated last year)
- (☆13, May 28, 2024, updated last year)