(ACL 2025 Main) Code for MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents https://www.arxiv.org/pdf/2503.01935
☆48Jun 21, 2025Updated 11 months ago
Alternatives and similar repositories for MARBLE
Users that are interested in MARBLE are comparing it to the libraries listed below.
- ☆20Oct 6, 2025Updated 8 months ago
- A reading list on popular MLSys topics☆25Mar 20, 2025Updated last year
- ☆21Jun 9, 2025Updated last year
- [VLDB 2025] BigVectorBench advances vector database benchmarking by defining and evaluating the embedding performance of heterogeneous da…☆32Jan 17, 2025Updated last year
- ☆32Jan 22, 2025Updated last year
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆28Mar 14, 2024Updated 2 years ago
- ☆11Nov 8, 2023Updated 2 years ago
- ☆38Jun 28, 2025Updated 11 months ago
- ☆25Apr 19, 2024Updated 2 years ago
- Lock-free RCU (Read-Copy-Update) user-space library☆13Jan 3, 2026Updated 5 months ago
- Let Models Speak Ciphers: Multiagent Debate through Embeddings☆17Feb 17, 2024Updated 2 years ago
- Benchmark of LLMs on real open-source projects against dependency hell, legacy toolchains, and complex build systems.☆58Dec 23, 2025Updated 5 months ago
- ☆11Dec 19, 2023Updated 2 years ago
- Codebase for generation-time and post-hoc text watermarking, as well as watermark radioactivity detection.☆63May 19, 2026Updated 3 weeks ago
- Use contrastive learning to train a large language model (LLM) as a retriever☆12Jul 19, 2024Updated last year
- Is Neuron Coverage a Meaningful Measure for Testing Deep Neural Networks? (FSE 2020)☆10Sep 23, 2021Updated 4 years ago
- code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…☆13Nov 17, 2024Updated last year
- An official PyTorch implementation of "Certifiably Robust Graph Contrastive Learning" (NeurIPS 2023)☆11Jan 22, 2024Updated 2 years ago
- Source code for the Information Sciences paper "Rumor Detection on Social Media through Mining the Social Circles with High Homogeneity"☆21Jun 10, 2023Updated 3 years ago
- Convert pretrained RoBerta models to various long-document transformer models☆11Apr 5, 2022Updated 4 years ago
- Official resources of "The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reaso…☆21Jun 13, 2025Updated 11 months ago
- ☆10Mar 4, 2024Updated 2 years ago
- Temporal-Dynamics Aware Adversarial Attacks on Discrete Time Dynamic Graph Models☆17Oct 19, 2024Updated last year
- ☆19Jun 20, 2025Updated 11 months ago
- Now you can date a Zoom meeting with AI's help.☆14Jun 22, 2025Updated 11 months ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- CUDA implementation of Multidimensional Scaling☆15May 8, 2021Updated 5 years ago
- Code repository accompanying the Heuristic Guided RL NeurIPS'21 paper☆17Jan 3, 2022Updated 4 years ago
- A modified version of Andrej Karpathy's build-nanogpt☆36Oct 26, 2025Updated 7 months ago
- Sotopia-RL: Reward Design for Social Intelligence☆50Apr 1, 2026Updated 2 months ago
- ☆11Oct 11, 2023Updated 2 years ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C…☆17May 27, 2024Updated 2 years ago
- The Infibench variant of bigcode-evaluation-harness --- a framework for the evaluation of autoregressive code generation language models.☆14Oct 19, 2024Updated last year
- ☆13Sep 5, 2021Updated 4 years ago
- ☆97Mar 26, 2024Updated 2 years ago
- Code for our NeurIPS 2024 paper Improved Generation of Adversarial Examples Against Safety-aligned LLMs☆12Nov 7, 2024Updated last year
- LLM as World Models using Bayesian inference☆20May 27, 2025Updated last year
- ☆15Oct 6, 2024Updated last year
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution"☆86Sep 13, 2025Updated 8 months ago