Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena"
☆49Jan 28, 2024Updated 2 years ago
Alternatives and similar repositories for auction-arena
Users that are interested in auction-arena are comparing it to the libraries listed below.
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 11 months ago
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning☆36Aug 19, 2023Updated 2 years ago
- [COLM'24] How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?☆23Oct 13, 2024Updated last year
- ☆18May 17, 2025Updated last year
- [TMLR'24] This repository includes the official implementation of our paper "FedConv: Enhancing Convolutional Neural Networks for Handling D…☆25Apr 30, 2024Updated 2 years ago
- Dataset Pinocchio for paper "Towards Understanding Factual Knowledge of Large Language Models" accepted by ICLR 2024 (Spotlight)☆12Mar 13, 2024Updated 2 years ago
- ☆98Dec 5, 2023Updated 2 years ago
- ☆21Mar 19, 2024Updated 2 years ago
- A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAEs trained by Joseph Bloom o…☆19Oct 4, 2024Updated last year
- Count Tokens of Code (forked from gocloc)☆45Aug 19, 2024Updated last year
- ☆10Oct 17, 2021Updated 4 years ago
- A probabilistic CKY parser for PCFGs☆19Mar 12, 2014Updated 12 years ago
- ☆76May 23, 2024Updated 2 years ago
- My personal response to OpenAI's Grant Challenge☆29Jun 13, 2023Updated 3 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.☆19Feb 6, 2025Updated last year
- XmodelLM☆38Nov 19, 2024Updated last year
- Code and data for experiments on semantic fragments☆11Jun 23, 2022Updated 3 years ago
- ☆10Dec 14, 2020Updated 5 years ago
- Problem-Oriented Segmentation and Retrieval EMNLP 2024 Findings☆34Nov 12, 2024Updated last year
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps"☆25Jul 12, 2024Updated last year
- HELP: a dataset for Handling Entailments with Lexical and logical Phenomena (Ver.1.0)☆15Jul 20, 2023Updated 2 years ago
- Neural machine translation implementation using DyNet's Python bindings☆17Jan 24, 2018Updated 8 years ago
- LATTICE turns retrieval into an LLM-driven navigation problem over a semantic scaffold☆37Mar 9, 2026Updated 3 months ago
- CodeGPT: A Code-Related Dialogue Dataset Generated by GPT and for GPT☆114Jun 16, 2023Updated 3 years ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆56Jul 3, 2023Updated 2 years ago
- SHUbeamer is a LaTeX Beamer template written to help Shanghai University faculty and students prepare presentations☆10Dec 1, 2021Updated 4 years ago
- Code for EMNLP-2018 paper "Variational Autoregressive Decoder for Neural Response Generation"☆16Oct 11, 2019Updated 6 years ago
- The official repository of the paper "X as Supervision: Contending with Depth Ambiguity in Unsupervised Monocular 3D Pose Estimation"☆13Jan 22, 2025Updated last year
- ☆17Dec 11, 2023Updated 2 years ago
- Repo for "Centaur: Robust Multimodal Fusion for Human Activity Recognition"☆10Jan 9, 2024Updated 2 years ago
- Independent implementation of DBCA method from http://arxiv.org/abs/1912.09713☆11Nov 25, 2020Updated 5 years ago
- Microsoft Complex Tasks Dataset☆17Jun 12, 2023Updated 3 years ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- (CVPR 2025) Official implementation of DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…☆28Aug 23, 2025Updated 9 months ago
- 📄 Evidence Retrieval and Claim Verification for the FEVER shared task using Transformer Networks☆12Feb 21, 2020Updated 6 years ago
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio…☆45Dec 6, 2025Updated 6 months ago
- 丢小墙 mini-program project, built with Tencent CloudBase (腾讯云开发)☆11Dec 10, 2022Updated 3 years ago
- An LLM plays 20 Questions with itself☆13Mar 31, 2023Updated 3 years ago
- ☆10Mar 19, 2024Updated 2 years ago