facebookresearch / ToolVerifierLinks
This repository contains the ToolSelect dataset which was used to fine-tune Llama-2 70B for tool selection.
☆22Updated last year
Alternatives and similar repositories for ToolVerifier
Users that are interested in ToolVerifier are comparing it to the libraries listed below
Sorting:
- IntructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc…☆31Updated last year
- the instructions and demonstrations for building a formal logical reasoning capable GLM☆55Updated last year
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI☆95Updated 2 years ago
- ☆68Updated 2 years ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions☆49Updated last year
- Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding☆27Updated last year
- Supporting code for ReCEval paper☆30Updated last year
- ☆35Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆24Updated last year
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification☆41Updated 2 years ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor☆31Updated last year
- Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023.☆64Updated 11 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆51Updated 5 months ago
- [NAACL 2024] Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? https://aclanthology.org/2024.naa…☆55Updated 3 months ago
- Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"☆96Updated last year
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"☆60Updated 9 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Updated last year
- [ACL 2023] Few-shot Reranking for Multi-hop QA via Language Model Prompting☆27Updated last month
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)☆22Updated 2 years ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval☆15Updated last year
- ☆14Updated last year
- A unified benchmark for math reasoning☆89Updated 2 years ago
- Transformers at any scale☆41Updated last year
- The data and the PyTorch implementation for the models and experiments in the paper "Language Model Decoding as Likelihood–Utility Alignm…☆14Updated 2 years ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners☆116Updated 4 months ago
- Retrieval Augmented Generation Generalized Evaluation Dataset☆58Updated 4 months ago
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023)☆64Updated last year
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs☆40Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators☆42Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Updated 2 years ago