Interface for GenAI-Arena [NeurIPS24]
☆17 · Feb 27, 2024 · Updated 2 years ago
Alternatives and similar repositories for GenAI-Arena
Users interested in GenAI-Arena are comparing it to the repositories listed below.
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 · Jun 22, 2025 · Updated 8 months ago
- Landing page + leaderboard for SWE-Bench benchmark ☆11 · Updated this week
- ☆45 · Jan 21, 2026 · Updated last month
- The source code for running LLMs on the AAAR-1.0 benchmark. ☆18 · Apr 5, 2025 · Updated 10 months ago
- Demo for advanced Java final project in 18-19 1 of Canghong Jin ☆25 · Nov 18, 2018 · Updated 7 years ago
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation ☆33 · Jun 30, 2025 · Updated 8 months ago
- ☆13 · Nov 21, 2025 · Updated 3 months ago
- Your command-line, context-aware chatbot for instant codebase insights & more ✨ ☆16 · May 30, 2024 · Updated last year
- Repository of IPBench ☆19 · Jan 4, 2026 · Updated last month
- ☆43 · Feb 11, 2025 · Updated last year
- Base mech ☆39 · Feb 20, 2026 · Updated last week
- GBM implementation on Legate ☆14 · Jan 28, 2026 · Updated last month
- True Few-Shot BioIE: Benchmarking GPT-3 In-Context and Small PLM Fine-Tuning ☆12 · Jul 6, 2022 · Updated 3 years ago
- ☆11 · May 24, 2024 · Updated last year
- Official implementation of "Imaginarium: Vision-guided High-quality 3D Scene Layout Generation" ☆41 · Dec 30, 2025 · Updated 2 months ago
- Project developed for AI Launch Lab's R&D program. TradeMind is a Machine Learning Stock Analysis tool aimed to give you more confidence … ☆13 · Aug 26, 2024 · Updated last year
- Luxonis ML library which abstracts logging, tracking, and other useful functionalities. ☆17 · Updated this week
- Kernel Playground - A playground to run large scale experiments on the Linux Kernel ☆17 · Nov 8, 2025 · Updated 3 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference … ☆14 · Dec 12, 2024 · Updated last year
- Kernel CLI ☆13 · Updated this week
- An autonomous service implementing a decentralized Impact Evaluator ☆13 · Dec 1, 2025 · Updated 3 months ago
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms" ☆23 · Oct 23, 2025 · Updated 4 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆12 · Jul 14, 2025 · Updated 7 months ago
- Middleware and macros/ui extensions to control smart buildings with Webex devices ☆20 · Jul 31, 2025 · Updated 7 months ago
- ☆11 · Jul 17, 2023 · Updated 2 years ago
- A Swedish Natural Language Understanding Benchmark ☆11 · Dec 12, 2025 · Updated 2 months ago
- Customizable progressive web application for dynamic indexing of Kubernetes Ingress resources. ☆17 · Jan 7, 2026 · Updated last month
- ☆12 · Jan 11, 2026 · Updated last month
- ☆12 · Oct 17, 2025 · Updated 4 months ago
- A benchmark designed to evaluate visualization generation methods. ☆57 · Nov 4, 2025 · Updated 3 months ago
- SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions ☆67 · Apr 15, 2025 · Updated 10 months ago
- ☆43 · Aug 15, 2023 · Updated 2 years ago
- ☆14 · Sep 23, 2024 · Updated last year
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks ☆10 · Nov 27, 2024 · Updated last year
- The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization" ☆10 · Dec 24, 2024 · Updated last year
- Faster version of AugShuffleNet without channel shuffle, computes partially, crossovers swiftly ☆11 · Feb 17, 2025 · Updated last year
- Remote Components demo using Next.js App Router apps ☆25 · Dec 9, 2025 · Updated 2 months ago
- Shaping Language Models with Cognitive Insights ☆15 · Feb 29, 2024 · Updated 2 years ago
- The source code and the data for ACL 2022 paper "Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Dat… ☆14 · Apr 21, 2023 · Updated 2 years ago