REALM-Bench: A Real-World Planning Benchmark for LLMs and Multi-Agent Systems
☆40Dec 31, 2025Updated 4 months ago
Alternatives and similar repositories for REALM-Bench
Users who are interested in REALM-Bench are comparing it to the libraries listed below.
- My research paper notes, focusing on data mining, recommender systems, and reinforcement learning.☆24Dec 4, 2021Updated 4 years ago
- Code for "APTBench: Benchmarking Agentic Potential of Base LLMs During Pre-Training"☆41Dec 23, 2025Updated 4 months ago
- Repository for the paper "MALADE: Orchestration of LLM-powered Agents with Retrieval Augmented Generation for Pharmacovigilance"☆25Feb 19, 2025Updated last year
- Track Golang trending on GitHub☆22Updated this week
- Joint Optimization of Cascade Ranking Models (WSDM 19)☆13Jun 21, 2022Updated 3 years ago
- ☆11Oct 9, 2021Updated 4 years ago
- Implementation of the WWW paper "Implicit User Awareness Modeling via Candidate Items for CTR Prediction in Search Ads"☆18Apr 27, 2022Updated 4 years ago
- PyTorch implementation of the paper "A Graph-Enhanced Click Model for Web Search"☆15Nov 17, 2021Updated 4 years ago
- ☆82Mar 11, 2025Updated last year
- KAIST medical VL research group☆20Dec 20, 2024Updated last year
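- Unofficial implementation of the CoT-decoding method for extracting CoT paths in an unsupervised way☆20Jan 11, 2026Updated 3 months ago
- A summary of the procedures for university graduates without Shanghai household registration (hukou) to stay and settle in Shanghai☆17Jan 24, 2018Updated 8 years ago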
- ☆31Apr 2, 2025Updated last year
- ☆15Apr 26, 2025Updated last year
- [WWW'2025] "RTBAgent: A LLM-based Agent System for Real-Time Bidding"☆32Apr 14, 2025Updated last year
- [ NeurIPS '22 ] Data distillation for recommender systems. Shows equivalent performance with 2-3 orders less data.☆23Jun 8, 2023Updated 2 years ago
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution☆28Nov 11, 2025Updated 5 months ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated last year
- ☆16Feb 22, 2025Updated last year
- ☆41Nov 22, 2025Updated 5 months ago
- Langchain + Docker + Neo4j☆10Oct 29, 2024Updated last year
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks☆10Nov 27, 2024Updated last year
- OPUS-Rota4: A Gradient-Based Protein Side-Chain Modeling Framework Assisted by Deep Learning-Based Predictors☆11Apr 14, 2022Updated 4 years ago
- ☆35May 24, 2025Updated 11 months ago
- Experiment code for the RecSys '21 paper "Mitigating Confounding Bias in Recommendation via Information Bottleneck"☆19Apr 6, 2022Updated 4 years ago
- ☆13May 12, 2025Updated 11 months ago
- ☆21Apr 29, 2026Updated last week
- An updated version of eICU Benchmark with an updated problem definition on LoS and Decompensation tasks☆12Aug 12, 2021Updated 4 years ago
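- Roadmaps for machine learning, deep learning, generative adversarial networks (GAN), graph neural networks (GNN), NLP, and big data, with extensive source code (Python, PyTorch) to help readers absorb the fundamentals, get through interviews, and go from beginner to qualified…☆10Feb 25, 2020Updated 6 years ago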
- ☆34Jul 4, 2025Updated 10 months ago
- Elevate your language models with insightful diversity metrics.☆11Feb 4, 2024Updated 2 years ago
- natural annotated text-category pairs for text classification☆10Sep 10, 2021Updated 4 years ago
- Model to predict kinase-ligand pKi values.☆12Jul 6, 2023Updated 2 years ago
- ☆11Apr 8, 2022Updated 4 years ago
- OPUS-Rota4: A Gradient-Based Protein Side-Chain Modeling Framework Assisted by Deep Learning-Based Predictors☆10Apr 14, 2022Updated 4 years ago
- This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm…☆11Oct 9, 2024Updated last year
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT)☆21Oct 28, 2025Updated 6 months ago
- [EMNLP 2024] FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents☆22Jan 6, 2025Updated last year
- ☆12Feb 2, 2024Updated 2 years ago