[ACL 2025] DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues
☆26Jul 10, 2025Updated 11 months ago
Alternatives and similar repositories for DICE-Bench
Users that are interested in DICE-Bench are comparing it to the libraries listed below.
- A list of NLP papers and news I've checked. Some entries may include a short description (abstract) in Korean.☆37Jun 23, 2026Updated last week
- MARU-Lang is an open-source RAG chatbot engine.☆27Jun 26, 2026Updated last week
- A fairy-tale creation service for children, My AI Fairy-Tale☆11Apr 7, 2023Updated 3 years ago
- Generative Visual Code Mobile World Model☆60May 15, 2026Updated last month
- [NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage☆16Sep 2, 2025Updated 10 months ago
- Consolidated Korean benchmark evaluation code (?)☆21Nov 15, 2024Updated last year
- ☆17Nov 18, 2024Updated last year
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.☆106Jul 9, 2025Updated 11 months ago
- This repo investigates LLMs' tendency to exhibit acquiescence bias in sequential QA interactions. Includes evaluation methods, datasets, …☆17Apr 24, 2026Updated 2 months ago
- ☆14Jun 11, 2024Updated 2 years ago
- Autonomous-driving delivery robot project : Selly☆10Jul 11, 2020Updated 5 years ago
- huggingface transformers tutorial, code, resources☆26Apr 7, 2024Updated 2 years ago
- Zephyr RTOS based Vehicle Management Unit☆39Jun 27, 2026Updated last week
- Yet Another PyTorch Tutorial☆12Jan 18, 2021Updated 5 years ago
- ACL22 paper: Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost☆42Nov 15, 2023Updated 2 years ago
- ☆16Jul 17, 2025Updated 11 months ago
- 🏡Introduction to design patterns, learned with the Java language☆14Dec 8, 2020Updated 5 years ago
- Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration☆11Sep 28, 2018Updated 7 years ago
- langchain opentutorial utility package for Open Tutorial☆10Feb 2, 2025Updated last year
- Black Box Optimization Methods☆14Jun 8, 2020Updated 6 years ago
- ☆36Oct 4, 2023Updated 2 years ago
- final-project-level3-nlp-02 created by GitHub Classroom☆11Dec 31, 2021Updated 4 years ago
- AI Code Reviews☆18Nov 22, 2025Updated 7 months ago
- ☆13Sep 12, 2022Updated 3 years ago
- Official repository for KoMT-Bench built by LG AI Research☆73Aug 8, 2024Updated last year
- Kinematic-aware Hierarchical Attention Network for Human Pose Estimation in Videos (WACV 2023)☆54Oct 29, 2023Updated 2 years ago
- PyTorch Tutorial for Boostcamp AI Tech☆52Sep 4, 2022Updated 3 years ago
- Proof system for Fact Verification☆14Jun 7, 2022Updated 4 years ago
- ☆10Sep 13, 2024Updated last year
- Reasoning-based Evaluation and Ranking of Translations.☆19Jun 2, 2026Updated last month
- Experimental tl;dr summaries for datasets on the Hugging Face Hub!☆10Apr 4, 2024Updated 2 years ago
- WandB Sweeps with PyTorch 🧹☆15Dec 24, 2022Updated 3 years ago
- AutoRAG example about benchmarking Korean embeddings.☆45Oct 2, 2024Updated last year
- ☆21Jun 1, 2026Updated last month
- ☆50Apr 20, 2026Updated 2 months ago
- Codes for "NAST: A Non-Autoregressive Generator with Word Alignment for Unsupervised Text Style Transfer" (ACL 2021 findings)☆15Nov 3, 2021Updated 4 years ago
- Forked repo from https://github.com/EleutherAI/lm-evaluation-harness/commit/1f66adc☆81Feb 28, 2024Updated 2 years ago
- This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation".☆10Apr 30, 2023Updated 3 years ago
- The official implementation of "ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering"☆66Jun 21, 2025Updated last year