Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena"
☆49Jan 28, 2024Updated 2 years ago
Alternatives and similar repositories for auction-arena
Users that are interested in auction-arena are comparing it to the libraries listed below.
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 11 months ago
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning☆36Aug 19, 2023Updated 2 years ago
- ☆25Sep 19, 2023Updated 2 years ago
- Code for the paper "Critical Thinking for Language Models"☆13Jun 1, 2021Updated 4 years ago
- The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings)☆15Aug 12, 2024Updated last year
- Create and share easy-to-make, built-to-last, innovative, and customizable experiences☆33Feb 21, 2024Updated 2 years ago
- ☆98Dec 5, 2023Updated 2 years ago
- A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAE's trained by Joseph Bloom o…☆19Oct 4, 2024Updated last year
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals".☆70Jun 29, 2024Updated last year
- Count Tokens of Code (forked from gocloc)☆45Aug 19, 2024Updated last year
- ☆10Oct 17, 2021Updated 4 years ago
- Code for our EMNLP 2022 paper: Generative Entity Typing with Curriculum Learning.☆13Aug 19, 2023Updated 2 years ago
- ☆76May 23, 2024Updated 2 years ago
- Code for "Neural Retrievers are Biased Towards LLM-Generated Content"☆14Oct 18, 2024Updated last year
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.☆19Feb 6, 2025Updated last year
- XmodelLM☆38Nov 19, 2024Updated last year
- Jupyter notebook that generates useful graphs and statistical metrics for analyzing and improving glycemic control.☆13Feb 5, 2025Updated last year
- ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.☆360Dec 3, 2025Updated 5 months ago
- Problem-Oriented Segmentation and Retrieval EMNLP 2024 Findings☆34Nov 12, 2024Updated last year
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps"☆25Jul 12, 2024Updated last year
- HELP: a dataset for Handling Entailments with Lexical and logical Phenomena (Ver.1.0)☆15Jul 20, 2023Updated 2 years ago
- Neural machine translation implementation using dynet's python bindings☆17Jan 24, 2018Updated 8 years ago
- Rethinking Propagation for Unsupervised Graph Domain Adaptation (AAAI-24)☆18Jul 18, 2024Updated last year
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models☆10Oct 27, 2023Updated 2 years ago
- The official repository of the paper "X as Supervision: Contending with Depth Ambiguity in Unsupervised Monocular 3D Pose Estimation"☆13Jan 22, 2025Updated last year
- ☆17Dec 11, 2023Updated 2 years ago
- Independent implementation of DBCA method from http://arxiv.org/abs/1912.09713☆11Nov 25, 2020Updated 5 years ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- (CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…☆27Aug 23, 2025Updated 9 months ago
- 📄 Evidence Retrieval and Claim Verification for the FEVER shared task using Transformer Networks☆12Feb 21, 2020Updated 6 years ago
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio…☆45Dec 6, 2025Updated 5 months ago
- ☆11Jul 10, 2025Updated 10 months ago
- 丢小墙 mini program project, built with Tencent CloudBase☆11Dec 10, 2022Updated 3 years ago
- LLM-based Multi-dimensional Debate Judge with Iterative Chronological Analysis☆20Oct 1, 2025Updated 7 months ago
- ☆14Jul 17, 2024Updated last year
- ☆10Mar 19, 2024Updated 2 years ago
- [ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"☆515Updated this week
- Building on the MLFlow toolset this project aims to extend the functionality for MLFlow, increase the automation and therefore reduce the…☆14Apr 2, 2023Updated 3 years ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆30Aug 9, 2025Updated 9 months ago