[NeurIPS 2024] GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations
☆69 · Sep 6, 2024 · Updated last year
Alternatives and similar repositories for GTBench
Users that are interested in GTBench are comparing it to the libraries listed below.
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues ☆61 · May 2, 2025 · Updated last year
- Code for paper "Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers" ☆17 · Jan 27, 2023 · Updated 3 years ago
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings. ☆90 · Jul 21, 2024 · Updated last year
- Official repo for An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization ☆16 · Mar 8, 2024 · Updated 2 years ago
- 2D road segmentation using lidar data during training ☆43 · Dec 21, 2023 · Updated 2 years ago
- Official Pytorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des… ☆55 · Aug 27, 2025 · Updated 8 months ago
- ☆35 · Jan 23, 2024 · Updated 2 years ago
- [ECAI 2023] MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient ☆32 · Dec 8, 2023 · Updated 2 years ago
- This repository contains the resource introduced in the paper: "Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis"… ☆25 · Oct 15, 2025 · Updated 6 months ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps" ☆25 · Jul 12, 2024 · Updated last year
- Official implementation of Inconsistency Masks. A robust semi-supervised segmentation framework that reframes model disagreement as a… ☆19 · Jan 23, 2026 · Updated 3 months ago
- An LSTM model implemented by PyTorch to perform sentiment classification on the Stanford Sentiment Treebank (SST-5) dataset. ☆12 · Sep 13, 2022 · Updated 3 years ago
- ☆21 · Jun 27, 2024 · Updated last year
- ☆11 · Jan 3, 2024 · Updated 2 years ago
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork" ☆33 · Nov 29, 2023 · Updated 2 years ago
- Simple program to manually caption your images (or any other file types) so you can use them for AI training ☆37 · Mar 20, 2023 · Updated 3 years ago
- ☆11 · Oct 8, 2023 · Updated 2 years ago
- [3DV 2025] Learning Naturally Aggregated Appearance for Efficient 3D Editing ☆33 · Feb 13, 2025 · Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆82 · Mar 18, 2024 · Updated 2 years ago
- PyTorch Implementation of "ASTRA: An Action Spotting TRAnsformer for Soccer Videos", ACM MMSports 2023. | 3rd place solution for SoccerNe… ☆43 · May 20, 2024 · Updated last year
- VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT ☆112 · Jan 13, 2026 · Updated 3 months ago
- Local LLM-based social network filter ☆72 · Jan 31, 2024 · Updated 2 years ago
- ☆27 · Mar 25, 2026 · Updated last month
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆127 · May 7, 2024 · Updated last year
- multimodal change detection ☆48 · Sep 20, 2024 · Updated last year
- ☆49 · Jan 18, 2024 · Updated 2 years ago
- A simple python package to stretch audio files and change their speed ☆12 · Feb 18, 2026 · Updated 2 months ago
- StrAttack, ICLR 2019 ☆33 · Aug 4, 2019 · Updated 6 years ago
- ☆25 · Dec 19, 2025 · Updated 4 months ago
- ☆55 · Apr 24, 2024 · Updated 2 years ago
- EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection ☆82 · Apr 26, 2024 · Updated 2 years ago
- ☆99 · Jun 27, 2024 · Updated last year
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges ☆30 · Sep 24, 2023 · Updated 2 years ago
- ☆10 · Dec 14, 2020 · Updated 5 years ago
- [Nature Communications] The official codes for "Towards Building Multilingual Language Model for Medicine" ☆280 · May 9, 2025 · Updated 11 months ago
- [ICML 2023] Are Diffusion Models Vulnerable to Membership Inference Attacks? ☆43 · Sep 4, 2024 · Updated last year
- [ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models ☆81 · Mar 6, 2026 · Updated 2 months ago
- Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning ☆257 · Sep 26, 2024 · Updated last year
- Browser automation for creating new pages in WordPress ☆13 · Jun 7, 2025 · Updated 10 months ago