CATArena is an engineering-level tournament evaluation platform for code agents driven by large language models (LLMs), built on an iterative competitive peer-learning framework.
☆64Dec 25, 2025Updated 3 months ago
Alternatives and similar repositories for CATArena
Users that are interested in CATArena are comparing it to the libraries listed below.
- [ICLR 2026] "VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?", Yuanxin Liu, Kun Ouyang, Haoning Wu, Yi Liu, L…☆37Jan 30, 2026Updated 2 months ago
- Embodied-Planner-R1: Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning☆27Mar 30, 2026Updated last week
- Multi-step reasoning MLLM☆19Mar 8, 2026Updated last month
- Differential Evolution Algorithm which uses Non-dominated Sorting for Multi-Objective Optimization☆10Mar 11, 2020Updated 6 years ago
- The official code of "Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search"☆27Sep 15, 2025Updated 6 months ago
- Introduces a novel Video Trimming (VT) task and proposes an agent-based approach (AVT) for detecting wasted footage, selecting valuable se…☆24Jan 20, 2025Updated last year
- The implementation of “Fine-tuning Graph Neural Networks by Preserving Graph Generative Patterns”☆18Jun 18, 2024Updated last year
- Counterfactual generation of tumor perturbations from multiplexed tissue images☆23May 13, 2025Updated 10 months ago
- Official implementation of the paper: [EMNLP 2025] RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruct…☆21Dec 9, 2025Updated 4 months ago
- [BMVC 2023 Oral] Boost Video Frame Interpolation via Motion Adaptation☆19Aug 22, 2024Updated last year
- A repository for reproducing experiments from the TxPert paper☆25Mar 25, 2026Updated 2 weeks ago
- ICML 2024 Paper "Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies"☆18Jul 10, 2024Updated last year
- Code for InstructBioMol, implementing the Nature Machine Intelligence paper "Advancing Biomolecular Understanding and Design Following Hu…☆30Aug 2, 2025Updated 8 months ago
- A multi-objective evolutionary algorithm with interval based initialization and self-adaptive crossover operator for large-scale feature …☆13Sep 6, 2022Updated 3 years ago
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences"☆20Oct 2, 2024Updated last year
- Code for “MEL: Efficient Multi-Task Evolutionary Learning for High-Dimensional Feature Selection“--[IEEE Transactions on Knowledge and Da…☆19Dec 9, 2025Updated 4 months ago
- Robust Principles: Architectural Design Principles for Adversarially Robust CNNs☆24Jan 13, 2024Updated 2 years ago
- An example project demonstrating the capabilities of the Segment Anything model☆11Jun 18, 2023Updated 2 years ago
- Paper: “MEMRL: SELF-EVOLVING AGENTS VIA RUNTIME REINFORCEMENT LEARNING ON EPISODIC MEMORY” Open-Source Code☆75Feb 27, 2026Updated last month
- Non-linear Motion Estimation for Video Frame Interpolation using Space-time Convolutions☆20Jun 23, 2022Updated 3 years ago
- LLMatic is a 2-archive QD algorithm that uses LLMs to mutate the networks. Tested for Neural Architecture search but can easily be used f…☆20Aug 14, 2024Updated last year
- Implementation of Siamese CBOW using Keras with a TensorFlow backend.☆12Feb 2, 2023Updated 3 years ago
- Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder (NeurIPS 2023)☆10Jun 5, 2024Updated last year
- MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs☆41Mar 13, 2026Updated 3 weeks ago
- This repository contains the code for https://decimer.ai☆52Nov 3, 2025Updated 5 months ago
- Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024☆27Nov 13, 2024Updated last year
- Some of my open-source documents☆10Feb 18, 2025Updated last year
- An AI benchmark for Pokémon VGC with agent implementations using multi-agent reinforcement learning, behavior cloning, LLMs, and heuristi…☆38Apr 5, 2026Updated last week
- Follow Me: Conversation Planning for Target-driven Recommendation Dialogue Systems☆11Aug 1, 2023Updated 2 years ago
- Improvement for Modular Camera based Tactile Sensor, with integrated circuit, optimized illumination, and biomimetic markers.☆16Feb 14, 2024Updated 2 years ago
- Dou Dizhu (doudizhu) AI trained using reinforcement learning.☆18Sep 19, 2019Updated 6 years ago
- Parsing of electronic invoices (PDF).☆18May 11, 2025Updated 11 months ago
- Official code for our NeurIPS 2024 paper "einspace: Searching for Neural Architectures from Fundamental Operations"☆30Feb 13, 2026Updated last month