🚧 Accepting Task Submissions 🚧
☆73 · Mar 17, 2026 · Updated this week
Alternatives and similar repositories for terminal-bench-3
Users who are interested in terminal-bench-3 are comparing it to the libraries listed below.
- Convert GitHub PRs into Harbor tasks ☆48 · Mar 10, 2026 · Updated last week
- ☆17 · Apr 11, 2025 · Updated 11 months ago
- Verifiers for LLM Reinforcement Learning ☆80 · Apr 15, 2025 · Updated 11 months ago
- Language models scale reliably with over-training and on downstream tasks ☆100 · Apr 2, 2024 · Updated last year
- Synthetic Data Generation for Evaluation ☆13 · Feb 21, 2025 · Updated last year
- Annotated sequence data ☆11 · Feb 2, 2025 · Updated last year
- ☆35 · Jan 25, 2026 · Updated last month
- An algorithm for classification from a graph-sparse support ☆15 · Jan 30, 2019 · Updated 7 years ago
- Host CIFAR-10.2 Data Set ☆13 · Sep 22, 2021 · Updated 4 years ago
- ☆21 · Jul 25, 2024 · Updated last year
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ… ☆28 · Sep 20, 2025 · Updated 6 months ago
- Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020. ☆13 · Nov 17, 2020 · Updated 5 years ago
- ☆11 · Oct 26, 2022 · Updated 3 years ago
- ☆11 · Jan 26, 2020 · Updated 6 years ago
- ☆13 · Jul 28, 2023 · Updated 2 years ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents ☆597 · Updated this week
- Login script ☆12 · Nov 4, 2022 · Updated 3 years ago
- Lightly-reviewed collection of community environments ☆219 · Mar 12, 2026 · Updated last week
- ☆72 · Mar 2, 2026 · Updated 2 weeks ago
- An automated data pipeline scaling RL to pretraining levels ☆74 · Oct 11, 2025 · Updated 5 months ago
- Source code and additional results for GLOD issues ☆12 · Jan 19, 2023 · Updated 3 years ago
- A high-throughput oblivious storage system ☆28 · May 31, 2023 · Updated 2 years ago
- Data creation, training and eval scripts for the IRCoder paper ☆20 · May 31, 2024 · Updated last year
- ☆15 · Jun 12, 2024 · Updated last year
- ☆24 · Oct 6, 2025 · Updated 5 months ago
- ☆17 · Mar 5, 2022 · Updated 4 years ago
- Run SWE-bench evaluations remotely ☆60 · Aug 14, 2025 · Updated 7 months ago
- Give your dependencies stars on GitHub! 🌟 ☆18 · May 1, 2021 · Updated 4 years ago
- Fast Topological Clustering with Wasserstein Distance (ICLR 2022) ☆12 · Jun 24, 2022 · Updated 3 years ago
- In Japanese. Learning tight-binding models and topological materials with Julia ☆15 · Mar 21, 2022 · Updated 4 years ago
- [ACL 2024] DiFiNet: Boundary-Aware Semantic Differentiation and Filtration Network for Nested Named Entity Recognition ☆16 · Oct 2, 2024 · Updated last year
- Benchmarking approximate nearest neighbors. Note: This is an archived version from our SISAP 2017 paper, see below. ☆28 · May 3, 2018 · Updated 7 years ago
- SCOPE: Self-evolving Context Optimization via Prompt Evolution - A framework for automatic prompt optimization ☆70 · Dec 18, 2025 · Updated 3 months ago
- Asynchronous Programming in Rust, Japanese edition ☆14 · Aug 16, 2022 · Updated 3 years ago
- Source Code for Graph Anomaly Detection with Unsupervised GNNs (ICDM2022) ☆12 · Oct 18, 2022 · Updated 3 years ago
- MCP DeepResearch Server: a deep research server built on LangGraph + Ollama + Tavily, with support for asynchronous execution, timeout control, and progress push updates ☆31 · Jun 16, 2025 · Updated 9 months ago
- Source code for the backend of ocwcentral.com ☆18 · May 5, 2024 · Updated last year
- BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution