antirez / LLM-FTC-sampling
First token cutoff sampling inference example
☆29Updated last year
Alternatives and similar repositories for LLM-FTC-sampling:
Users who are interested in LLM-FTC-sampling are comparing it to the libraries listed below
- Using modal.com to process FineWeb-edu data☆20Updated 2 weeks ago
- 🛠 Self-hosted, fast, and consistent remote configuration for apps.☆14Updated 2 years ago
- A text-to-SQL prototype on the northwind sqlite dataset☆12Updated 6 months ago
- The official evaluation suite and dynamic data release for MixEval.☆11Updated 6 months ago
- Run Llama 2 using MLX on macOS☆33Updated last year
- llm plugin for Cerebras fast inference API☆24Updated 2 weeks ago
- Trace LLM calls (and others) and visualize them in WandB, as interactive SVG or using a streaming local webapp☆14Updated last month
- An implementation of Self-Extend, to expand the context window via grouped attention☆118Updated last year
- Run LLMs on Replicate with vLLM☆16Updated 5 months ago
- Because it's there.☆15Updated 6 months ago
- Nexusflow function call, tool use, and agent benchmarks.☆19Updated 3 months ago
- A super simple web interface to perform blind tests on LLM outputs.☆28Updated last year
- Llama implementations benchmarking framework☆12Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates☆128Updated 2 weeks ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs☆75Updated 8 months ago
- Embedding models from Jina AI☆58Updated last year
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback☆72Updated 2 weeks ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA☆36Updated last year
- Public reports detailing responses to sets of prompts by Large Language Models.☆30Updated 2 months ago
- Inference Llama/Llama2/Llama3 Models in NumPy☆20Updated last year
- Training hybrid models for dummies.☆20Updated 2 months ago
- A clone of OpenAI's Tokenizer page for HuggingFace Models☆45Updated last year
- Very basic framework for composable parameterized large language model (Q)LoRA / (Q)DoRA fine-tuning using mlx, mlx_lm, and OgbujiPT.☆37Updated last month
- [WIP] Transformer to embed Danbooru labelsets☆13Updated 11 months ago
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.☆25Updated 9 months ago
- A pipeline that uses API calls to agnostically convert unstructured data into structured training data☆30Updated 6 months ago
- ANE accelerated embedding models!☆17Updated 3 months ago