NVIDIA/When2Call

A dataset for training and evaluating LLMs on decision making about "when (not) to call" functions

☆67

Alternatives and similar repositories for When2Call

Users who are interested in When2Call are comparing it to the repositories listed below.

THUNLP-MT / StableToolBench
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
☆237 · Apr 15, 2025 · Updated last year
yuyq18 / StepTool
☆36 · May 24, 2025 · Updated last year
Skripkon / piano-music-generator
My ugly yet simple code for generating music!
☆13 · Jan 16, 2024 · Updated 2 years ago
tokaevaAA / Teaching
☆16 · May 31, 2025 · Updated last year
stefan-it / ukrainian-electra
Ukrainian ELECTRA model
☆12 · Mar 11, 2023 · Updated 3 years ago
ServiceNow / AgentAda
Agent ADA is a comprehensive evaluation and data analytics framework focused on insights generation and skills assessment.
☆15 · Aug 19, 2025 · Updated 11 months ago
oslook / mcp-servers-schemas
This document provides a list of different MCP servers. For each server, we provide a schema definition that includes the latest basic in…
☆16 · Jul 2, 2026 · Updated 3 weeks ago
yxzwang / FamilyTool
FamilyTool benchmark
☆14 · Sep 10, 2025 · Updated 10 months ago
ChengpengLi1003 / Awesome-Long-Chain-of-Thought-Reasoning-with-tools
A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools.
☆46 · Dec 17, 2025 · Updated 7 months ago
IBM / API-BLEND
Companion code to https://arxiv.org/abs/2402.15491
☆22 · Sep 18, 2025 · Updated 10 months ago
benediktstroebl / agent-evals
☆27 · May 28, 2025 · Updated last year
efarrell1 / train_sparse_autoencoder
Trains Sparse Autoencoders based on outputs from language models
☆11 · Oct 7, 2024 · Updated last year
kenkenpa2126 / vanilla_transformer_from_scratch_with_JAX
☆10 · Dec 18, 2023 · Updated 2 years ago
HowieHwong / MetaTool
[ICLR'24] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
☆115 · Mar 21, 2024 · Updated 2 years ago
smilegate-ai / OPELA
☆29 · Nov 23, 2022 · Updated 3 years ago
HarryMayne / qwen_3_chat_templates
Alternative chat templates for Qwen 3 8B. Useful for multi-turn RL
☆15 · Sep 4, 2025 · Updated 10 months ago
Callione / LLaVA-MOSS2
A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.
☆13 · Sep 19, 2024 · Updated last year
HuanzhiMao / BFCL-Result
Public Evaluation Result Archive for BFCL
☆30 · Dec 17, 2025 · Updated 7 months ago
hucsmn / suffix_array
suffix array construction and searching algorithms for in-memory binary data.
☆13 · Sep 10, 2022 · Updated 3 years ago
MTU-Bench-Team / MTU-Bench
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
☆60 · Jul 24, 2025 · Updated last year
ddhruvkr / CONTRADOC
☆13 · Feb 8, 2025 · Updated last year
google-research-datasets / QAmeleon
QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…
☆34 · Aug 15, 2023 · Updated 2 years ago
spAurora / RushSeat-UI
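Seat-grabbing program for the Wuhan University information campus library (武大信图): supports continuous background monitoring, grabbing window seats and seats with computers, and automatic shutdown after a seat is grabbed
☆15 · Dec 8, 2022 · Updated 3 years ago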
swei2001 / RouteSAEs
☆15 · Jan 2, 2026 · Updated 6 months ago
CSSLab / ThinkTwice
Jointly Optimizing Large Language Models for Reasoning and Self-Refinement
☆15 · Apr 22, 2026 · Updated 3 months ago
chawyehsu / wende
🍎 Wende Chinese QA system (experimental)
☆10 · Jun 1, 2021 · Updated 5 years ago
junekihong / beam-span-parser
A DP beam-search extension of Mitchell Stern's span-based neural constituency parser
☆11 · Aug 24, 2022 · Updated 3 years ago
dxlong2000 / FormatBiasEval
Official codes for NAACL 2025 paper "LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias …
☆11 · Nov 25, 2025 · Updated 7 months ago
UmeanNever / RankSurprisalRatio
[ACL 2026 Main] Official Repo for Paper "Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Ali…
☆17 · Jul 1, 2026 · Updated 3 weeks ago
epfl-dlab / forc
Framework for Cost-Effective Language Model Choice
☆16 · Dec 12, 2023 · Updated 2 years ago
tdoly / The-Art-Of-Programming-by-July
This project is the e-book version of July's 《程序员编程艺术》 (The Art of Programming)
☆10 · Jan 9, 2014 · Updated 12 years ago
icip-cas / LiteCoder
Advancing Small and Medium-sized Code Agents.
☆17 · May 29, 2026 · Updated last month
yanqiangmiffy / tree2retriever
Recursive Abstractive Processing for Tree-Organized Retrieval
☆10 · May 30, 2024 · Updated 2 years ago
megagonlabs / cocosum
Code & Data for Comparative Opinion Summarization via Collaborative Decoding (Iso et al; Findings of ACL 2022)
☆23 · Mar 3, 2025 · Updated last year
XiaoduoAILab / XmodelLM
XmodelLM
☆38 · Nov 19, 2024 · Updated last year
Chiu-Te-Wang / Paper-Reading
Paper reading summary (mainly NLP-related papers)
☆11 · Nov 6, 2019 · Updated 6 years ago
Fu-Dayuan / AgentRefine
(ICLR 2025) AgentRefine: Enhancing Agent Generalization through Refinement Tuning
☆20 · Nov 22, 2025 · Updated 8 months ago
xiaofei05 / TSST
Code for EMNLP2021 paper “Transductive Learning for Unsupervised Text Style Transfer”
☆12 · Sep 19, 2021 · Updated 4 years ago
genlm / genlm-backend
High-performance backend for language model probabilistic programs
☆17 · Jun 29, 2026 · Updated 3 weeks ago