antirez / LLM-FTC-sampling
First token cutoff sampling inference example
☆30 Updated last year
Alternatives and similar repositories for LLM-FTC-sampling:
Users who are interested in LLM-FTC-sampling are comparing it to the repositories listed below.
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated 7 months ago
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 7 months ago
- Using modal.com to process FineWeb-edu data ☆20 Updated last month
- A super simple web interface to perform blind tests on LLM outputs. ☆28 Updated last year
- Because it's there. ☆16 Updated 7 months ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 7 months ago
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆25 Updated 10 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 3 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆135 Updated last week
- A collection of optimizers for MLX ☆35 Updated 2 weeks ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 4 months ago
- Trace LLM calls (and others) and visualize them in WandB, as interactive SVG or using a streaming local webapp ☆14 Updated 2 months ago
- Very basic framework for composable parameterized large language model (Q)LoRA / (Q)Dora fine-tuning using mlx, mlx_lm, and OgbujiPT. ☆40 Updated 2 months ago
- A python command-line tool to download & manage MLX AI models from Hugging Face. ☆17 Updated 8 months ago
- ☆13 Updated last year
- LLama implementations benchmarking framework ☆12 Updated last year
- Training hybrid models for dummies. ☆20 Updated 3 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 4 months ago
- look how they massacred my boy ☆63 Updated 6 months ago
- A miniature version of Modal ☆20 Updated 10 months ago
- ☆35 Updated this week
- 🛠 Self-hosted, fast, and consistent remote configuration for apps. ☆15 Updated 2 years ago
- Thin wrapper around GGML to make life easier ☆27 Updated this week
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆22 Updated last month
- TRITONCACHE implementation of a Redis cache ☆13 Updated 3 weeks ago
- Creating Generative AI Apps which work ☆17 Updated 3 weeks ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 5 months ago
- Latent Large Language Models ☆18 Updated 8 months ago
- ☆18 Updated last month
- ☆48 Updated last year