firstbatchxyz / function-calling-eval
The DPAB-α Benchmark
☆20 Updated 3 months ago
Alternatives and similar repositories for function-calling-eval:
Users who are interested in function-calling-eval compare it to the repositories listed below:
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 4 months ago
- ☆15 Updated 10 months ago
- ☆22 Updated 2 months ago
- ☆53 Updated 10 months ago
- ☆80 Updated 3 months ago
- ☆12 Updated 7 months ago
- ☆23 Updated 5 months ago
- Distributed Inference for MLX LLM ☆87 Updated 8 months ago
- Run AI generated code in isolated sandboxes ☆54 Updated 2 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 5 months ago
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆77 Updated 4 months ago
- Example implementation of Iteration of Thought - Give a star if you like the project ☆40 Updated 4 months ago
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆59 Updated 11 months ago
- Simple examples using Argilla tools to build AI ☆52 Updated 5 months ago
- Conduct in-depth research with AI-driven insights : DeepDive is a command-line tool that leverages web searches and AI models to generate… ☆42 Updated 7 months ago
- ☆29 Updated 4 months ago
- Code for paper https://arxiv.org/abs/2501.00522 ☆12 Updated 2 months ago
- ☆153 Updated 9 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 2 months ago
- Scripts to create your own moe models using mlx ☆89 Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆87 Updated this week
- A super simple web interface to perform blind tests on LLM outputs. ☆28 Updated last year
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆37 Updated 11 months ago
- Very basic framework for composable parameterized large language model (Q)LoRA / (Q)Dora fine-tuning using mlx, mlx_lm, and OgbujiPT. ☆39 Updated 2 months ago
- Complex RAG backend ☆28 Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours ☆60 Updated 8 months ago
- Easy to use, High-Performance Knowledge Distillation for LLMs ☆60 Updated this week
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆56 Updated 2 months ago
- Proxy server that converts Anthropic API requests to OpenAI format and sends it to OpenRouter. It's used to use Claude Code with OpenRout… ☆60 Updated last month