antirez / LLM-FTC-sampling
First token cutoff sampling inference example
☆30 Updated last year
Alternatives and similar repositories for LLM-FTC-sampling
Users who are interested in LLM-FTC-sampling are comparing it to the repositories listed below.
- Transformer GPU VRAM estimator ☆66 Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆157 Updated 2 months ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆87 Updated last year
- LLama implementations benchmarking framework ☆12 Updated last year
- Using modal.com to process FineWeb-edu data ☆20 Updated 3 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated last year
- ☆38 Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆55 Updated last year
- Because it's there. ☆16 Updated 9 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆27 Updated 8 months ago
- llama.cpp gguf file parser for javascript ☆43 Updated 7 months ago
- Pivotal Token Search ☆109 Updated this week
- A super simple web interface to perform blind tests on LLM outputs. ☆28 Updated last year
- Simple high-throughput inference library ☆120 Updated 2 months ago
- GGUF implementation in C as a library and a tools CLI program ☆274 Updated 6 months ago
- A Learning Journey: Micrograd in Mojo 🔥 ☆61 Updated 9 months ago
- Implementation of nougat that focuses on processing pdf locally. ☆81 Updated 6 months ago
- A collection of optimizers for MLX ☆36 Updated last month
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 9 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆108 Updated 3 months ago
- C API for MLX ☆117 Updated this week
- look how they massacred my boy ☆63 Updated 9 months ago
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆32 Updated last year
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆58 Updated last year
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆49 Updated 2 weeks ago
- The Prime Intellect CLI provides a powerful command-line interface for managing GPU resources across various providers ☆29 Updated last month
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated 9 months ago
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 9 months ago
- Run Llama 2 using MLX on macOS ☆34 Updated last year
- QLLM: A powerful CLI for seamless interaction with multiple Large Language Models. Simplify AI workflows, streamline development, and unl… ☆33 Updated 3 months ago