[ACL 2025] DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues
☆26Jul 10, 2025Updated 8 months ago
Alternatives and similar repositories for DICE-Bench
Users who are interested in DICE-Bench are comparing it to the libraries listed below.
- A list of NLP papers and news I've checked. Some may have a short description (abstract) in Korean.☆37Updated this week
- MARU-Lang is an open-source RAG chatbot engine.☆27Feb 2, 2026Updated last month
- A fairy-tale creation service for children, My AI Fairy-Tale☆11Apr 7, 2023Updated 2 years ago
- [NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage☆16Sep 2, 2025Updated 6 months ago
- A unified collection(?) of Korean benchmark evaluation code☆20Nov 15, 2024Updated last year
- ☆17Nov 18, 2024Updated last year
- This repo investigates LLMs' tendency to exhibit acquiescence bias in sequential QA interactions. Includes evaluation methods, datasets, …☆39Sep 23, 2025Updated 6 months ago
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.☆104Jul 9, 2025Updated 8 months ago
- Autonomous-driving delivery robot project: Selly☆10Jul 11, 2020Updated 5 years ago
- ☆14Jun 11, 2024Updated last year
- huggingface transformers tutorial, code, resources☆26Apr 7, 2024Updated last year
- Zephyr RTOS based Vehicle Management Unit☆33Mar 10, 2026Updated 2 weeks ago
- Yet Another PyTorch Tutorial☆12Jan 18, 2021Updated 5 years ago
- ACL22 paper: Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost☆42Nov 15, 2023Updated 2 years ago
- ☆16Jul 17, 2025Updated 8 months ago
- 🏡 Introduction to design patterns, learned with the Java language☆14Dec 8, 2020Updated 5 years ago
- Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration☆11Sep 28, 2018Updated 7 years ago
- ☆36Oct 4, 2023Updated 2 years ago
- Black Box Optimization Methods☆14Jun 8, 2020Updated 5 years ago
- final-project-level3-nlp-02 created by GitHub Classroom☆11Dec 31, 2021Updated 4 years ago
- AI Code Reviews☆17Nov 22, 2025Updated 4 months ago
- ☆13Sep 12, 2022Updated 3 years ago
- Official repository for KoMT-Bench built by LG AI Research☆71Aug 8, 2024Updated last year
- Kinematic-aware Hierarchical Attention Network for Human Pose Estimation in Videos (WACV 2023)☆54Oct 29, 2023Updated 2 years ago
- PyTorch Tutorial for Boostcamp AI Tech☆52Sep 4, 2022Updated 3 years ago
- Proof system for Fact Verification☆15Jun 7, 2022Updated 3 years ago
- ☆10Sep 13, 2024Updated last year
- Reasoning-based Evaluation and Ranking of Translations.☆20Jul 18, 2025Updated 8 months ago
- Experimental tl;dr summaries for datasets on the Hugging Face Hub!☆10Apr 4, 2024Updated last year
- AutoRAG example about benchmarking Korean embeddings.☆43Oct 2, 2024Updated last year
- WandB Sweeps with PyTorch 🧹☆15Dec 24, 2022Updated 3 years ago
- oh-my-opencode patterns ported to OpenClaw☆115Mar 2, 2026Updated 3 weeks ago
- ☆18Updated this week
- ☆46Aug 28, 2025Updated 6 months ago
- Codes for "NAST: A Non-Autoregressive Generator with Word Alignment for Unsupervised Text Style Transfer" (ACL 2021 findings)☆15Nov 3, 2021Updated 4 years ago
- The official implementation of "ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering"☆57Jun 21, 2025Updated 9 months ago
- Forked repo from https://github.com/EleutherAI/lm-evaluation-harness/commit/1f66adc☆82Feb 28, 2024Updated 2 years ago
- ☆51Apr 14, 2025Updated 11 months ago
- This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation".☆10Apr 30, 2023Updated 2 years ago