snuhcc/DICE-Bench

[ACL 2025] DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues

☆26

Alternatives and similar repositories for DICE-Bench

Users interested in DICE-Bench are comparing it to the libraries listed below.

kc-ml2 / MARU-Lang
MARU-Lang is an open-source RAG chatbot engine.
☆27 · Updated this week
chanmuzi / NLP-Paper-News
A list of NLP papers and news I've checked. There may be short descriptions (abstracts) of them in Korean.
☆38 · Updated this week
Minju-nimm / MIT_PJT
A fairy-tale creation service for children, My AI Fairy-Tale
☆11 · Apr 7, 2023 · Updated 3 years ago
seraaaayeo / SellyDev
Autonomous-driving delivery robot project: Selly
☆10 · Jul 11, 2020 · Updated 6 years ago
Marker-Inc-Korea / KoLLM_Eval
Unified collection of Korean benchmark evaluation code (?)
☆21 · Nov 15, 2024 · Updated last year
yubol-bobo / MT-Consistency
This repo investigates LLMs' tendency to exhibit acquiescence bias in sequential QA interactions. Includes evaluation methods, datasets, …
☆17 · Apr 24, 2026 · Updated 3 months ago
dnotitia / smoothie-qwen
A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.
☆106 · Jul 9, 2025 · Updated last year
sjchoi86 / yet-another-pytorch-tutorial
Yet Another PyTorch Tutorial
☆12 · Jan 18, 2021 · Updated 5 years ago
tigerchen52 / LOVE
ACL22 paper: Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost
☆41 · Nov 15, 2023 · Updated 2 years ago
hiddenmaze / InteractivePickup
Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
☆11 · Sep 28, 2018 · Updated 7 years ago
jeon185 / LaViC
Implementation of LaViC (KDD 2025)
☆13 · Jun 1, 2025 · Updated last year
sjchoi86 / bbopt
Black Box Optimization Methods
☆14 · Jun 8, 2020 · Updated 6 years ago
kakao / kanana-2
☆23 · Jun 30, 2026 · Updated 3 weeks ago
KanghoonYoon / torch-rasgg
This is an anonymous repository for submitting our work to a conference
☆14 · Dec 17, 2024 · Updated last year
MrBananaHuman / PangyoCorpora
☆38 · Oct 4, 2023 · Updated 2 years ago
zlxxlz1026 / CSHI
☆15 · Jun 18, 2024 · Updated 2 years ago
sjchoi86 / simple-mujoco-usage-v2
☆13 · Sep 12, 2022 · Updated 3 years ago
sjchoi86 / upstage-basic-deeplearning
PyTorch Tutorial for Boostcamp AI Tech
☆52 · Sep 4, 2022 · Updated 3 years ago
larrydiamond / AICodeReviewer
AI Code Reviews
☆18 · Nov 22, 2025 · Updated 8 months ago
LG-AI-EXAONE / KoMT-Bench
Official repository for KoMT-Bench built by LG AI Research
☆73 · Aug 8, 2024 · Updated last year
krishnamrith12 / ProoFVer
Proof system for Fact Verification
☆14 · Jun 7, 2022 · Updated 4 years ago
GAIR-NLP / AgencyBench
[ACL2026 Main] AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts
☆90 · Jan 23, 2026 · Updated 6 months ago
JAX-KR / jax-flax-book
☆10 · Sep 13, 2024 · Updated last year
Ravoxsg / efficient_unified_crs
Source code for PECRS (EACL 2024)
☆12 · Feb 3, 2024 · Updated 2 years ago
disi-unibo-nlp / medgenie
The First Generate-then-Read Framework for Multiple-Choice Question Answering in Medicine
☆15 · May 27, 2024 · Updated 2 years ago
ozzafar / count_token_optimization
☆16 · Sep 6, 2024 · Updated last year
heqin-zhu / UOD_universal_oneshot_detection
[MICCAI 2023] (early accept) UOD: universal oneshot detection of anatomical landmarks. https://arxiv.org/abs/2306.07615
☆12 · Jan 4, 2024 · Updated 2 years ago
davanstrien / huggingface-tldr
Experimental tl;dr summaries for datasets on the Hugging Face Hub!
☆10 · Apr 4, 2024 · Updated 2 years ago
RaghuHemadri / Reinforcement-Learning-Reading-List
☆11 · Jul 14, 2021 · Updated 5 years ago
thu-coai / NAST
Codes for "NAST: A Non-Autoregressive Generator with Word Alignment for Unsupervised Text Style Transfer" (ACL 2021 findings)
☆15 · Nov 3, 2021 · Updated 4 years ago
sail-sg / AnytimeReasoner
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆54 · Jul 15, 2025 · Updated last year
wikibook / openai-llm
Example code for 《AI Programming with GPT-4, ChatGPT, LlamaIndex, and LangChain》
☆10 · Jan 16, 2024 · Updated 2 years ago
roi-hpi / IDK-token-tuning
☆16 · Jul 17, 2025 · Updated last year
sangwu99 / Simplot
The official source code for SIMPLOT: Enhancing Chart Question Answering by Distilling Essentials, accepted at NAACL 2025 (Findings).
☆19 · Feb 4, 2025 · Updated last year
YoungDubbyDu / LLM-based-Multi-Agent-Systems
A summary of papers on LLM-based multi-agent systems
☆10 · Jun 23, 2024 · Updated 2 years ago
wzf2000 / RecLLMSim
A dataset for task-based recommendation conversation.
☆16 · Jul 13, 2026 · Updated 2 weeks ago
hrwise-nlp / AppBench
This is for EMNLP 2024 Paper: AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction
☆16 · Nov 4, 2024 · Updated last year
sutdcv / Chaotic-World
[ICCV2023] Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events
☆10 · Dec 7, 2024 · Updated last year
PEBpung / MLOps-Tutorial
WandB Sweeps with PyTorch 🧹
☆15 · Dec 24, 2022 · Updated 3 years ago