interstellarninja / function-calling-eval

A framework for evaluating function calls made by LLMs

☆37

Alternatives and similar repositories for function-calling-eval

Users who are interested in function-calling-eval are comparing it to the libraries listed below.

Arize-ai / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
☆102 Updated last year
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119 Updated last year
weaviate-tutorials / Hurricane
Writing Blog Posts with Generative Feedback Loops!
☆50 Updated last year
teknium1 / transformers-gptq-quant
☆47 Updated last year
zbambergerNLP / strategic-debate-tot
A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments
☆87 Updated 10 months ago
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆101 Updated last year
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91 Updated 6 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆55 Updated 5 months ago
teknium1 / ShareGPT-Builder
☆115 Updated 7 months ago
deployradiant / pychatml
Chat Markup Language conversation library
☆55 Updated last year
davanstrien / data-for-fine-tuning-llms
☆77 Updated last year
QuixiAI / kraken
☆66 Updated last year
Technoculture / personal-graph
Simple Graph Memory for AI applications
☆89 Updated 2 months ago
h2oai / enterprise-h2ogpte
Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform
☆87 Updated last month
BBischof / yapping
Verbosity control for AI agents
☆64 Updated last year
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆53 Updated 8 months ago
redotvideo / pluto
Synthetic Data for LLM Fine-Tuning
☆120 Updated last year
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆65 Updated last year
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆66 Updated 8 months ago
QuixiAI / OpenChatML
☆157 Updated last year
interstellarninja / MeeseeksAI
A framework for orchestrating AI agents using a mermaid graph
☆77 Updated last year
PrithivirajDamodaran / blitz-embed
C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc…
☆22 Updated last year
QuixiAI / SystemChat
☆30 Updated last year
AblateIt / finetune-study
Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full fine-tunes.
☆82 Updated last year
geronimi73 / phi2-finetune
☆87 Updated last year
migtissera / Sensei
Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI
☆222 Updated last year
official-elinas / zeus-llm-trainer
Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models
☆69 Updated last year
QuixiAI / spectrum
☆128 Updated 3 months ago
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆49 Updated 5 months ago
rgbkrk / chatlab
⚡️🧪 Fast LLM Tool Calling Experimentation, big and smol
☆148 Updated 10 months ago