The evaluation code for Multi-IF: multi-turn and multilingual instruction following
☆60 · Oct 29, 2024 · Updated last year
Alternatives and similar repositories for Multi-IF
Users who are interested in Multi-IF are comparing it to the libraries listed below.
- Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'. ☆21 · Apr 3, 2025 · Updated 10 months ago
- ☆12 · Jul 10, 2023 · Updated 2 years ago
- Official Code for M-RᴇᴡᴀʀᴅBᴇɴᴄʜ: Evaluating Reward Models in Multilingual Settings (ACL 2025 Main) ☆40 · May 16, 2025 · Updated 9 months ago
- Code for AAAI 2023 research track paper "Question Decomposition Tree for Answering Complex Questions over Knowledge Bases" ☆18 · Jan 3, 2024 · Updated 2 years ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs ☆48 · Aug 26, 2024 · Updated last year
- [ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models ☆119 · Jun 12, 2025 · Updated 8 months ago
- Code release for "Understanding Bias in Large-Scale Visual Datasets" ☆22 · Dec 4, 2024 · Updated last year
- 🦫 BEAVER: An Enterprise Benchmark for Text-to-SQL ☆26 · May 23, 2025 · Updated 9 months ago
- CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings ☆65 · Feb 3, 2025 · Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆102 · Feb 20, 2025 · Updated last year
- ☆27 · Mar 21, 2024 · Updated last year
- A simple lightweight Model Context Protocol (MCP) server integration framework ☆17 · Jan 23, 2026 · Updated last month
- Structured TRIZ prompt engineering for LLMs in an open, portable XML format – MIT licensed. ☆14 · Nov 11, 2025 · Updated 3 months ago
- AuraMatrix is a personality analysis web app that uses an LLM for evaluation. I made this for Gyanotsav-2025 to show different ways to ut… ☆11 · Dec 22, 2025 · Updated 2 months ago
- Evaluate the Quality of Critique ☆36 · Jun 1, 2024 · Updated last year
- ☆33 · Aug 30, 2023 · Updated 2 years ago
- CoachLint is your AI coding coach. It guides you through errors instead of just solving them for you. ☆23 · Nov 20, 2025 · Updated 3 months ago
- ☆13 · Jun 18, 2025 · Updated 8 months ago
- Glitch Gremlin AI ☆15 · Apr 5, 2025 · Updated 10 months ago
- MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces ☆10 · Mar 24, 2025 · Updated 11 months ago
- VibEx (vx) is a developer-friendly CLI tool that streamlines the process of working with AI coding assistants. It helps developers prepar… ☆28 · May 17, 2025 · Updated 9 months ago
- A discord bot to interact with claude code on your personal projects ☆70 · Jun 16, 2025 · Updated 8 months ago
- The official repository of the Omni-MATH benchmark. ☆93 · Dec 22, 2024 · Updated last year
- Fastened CROWN: Tightened Neural Network Robustness Certificates ☆10 · Feb 10, 2020 · Updated 6 years ago
- Shakey OS Mobile AI Framework for React Native allowing people to build React Native apps for IOS and Android with AI tooling and wallet … ☆28 · Feb 3, 2025 · Updated last year
- React Native, Right Now (rn-rn) ☆18 · Sep 2, 2025 · Updated 6 months ago
- A Discord bot to retrieve Shopify Orders and Statistics ☆10 · Dec 9, 2025 · Updated 2 months ago
- "Open-source toolkit (Python Library, Registry API, CLI) for secure, decentralized AI agent interoperability using A2A/MCP." ☆14 · May 10, 2025 · Updated 9 months ago
- An open source deep research clone. AI Agent (Local LLM or Gemini) that reasons large amounts of web data extracted with SwiftSoup. ☆13 · Feb 10, 2025 · Updated last year
- 💀 gigasmol: a lightweight wrapper for gigachat api model for seamless use with smolagents. ☆15 · Oct 23, 2025 · Updated 4 months ago
- 📱 A template for your next React Native project: Expo, TypeScript, ReStyle, Husky, react-navigation, react-query, react-hook-form, zusta… ☆16 · Dec 15, 2025 · Updated 2 months ago
- SYSTEM PROMPT TRANSPARENCY FOR ALL ☆12 · May 22, 2025 · Updated 9 months ago
- A powerful AI prompt engineering tool that transforms simple instructions into detailed, context-rich prompts using Google's Gemini Pro t… ☆15 · Aug 28, 2025 · Updated 6 months ago
- AI Tasks. An LLM-integrated agent orchestration tool for Liferay. ☆14 · May 16, 2025 · Updated 9 months ago
- IBM watsonx Code Assistant for Red Hat Ansible Lightspeed demystifies the process of Ansible Playbook creation through generative AI-powe… ☆19 · Sep 18, 2025 · Updated 5 months ago
- ☆16 · Jun 25, 2025 · Updated 8 months ago
- Directed masked autoencoders ☆14 · Feb 20, 2026 · Updated last week
- Automatic stabilizing and auto-piloting system for RC flying wing ☆14 · Mar 3, 2016 · Updated 9 years ago