antirez / LLM-FTC-sampling
First token cutoff sampling inference example
☆31 · Updated last year
Alternatives and similar repositories for LLM-FTC-sampling
Users who are interested in LLM-FTC-sampling compare it to the libraries listed below.
- Training hybrid models for dummies. ☆29 · Updated 3 weeks ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆89 · Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆11 · Updated last year
- Pivotal Token Search ☆131 · Updated 4 months ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆53 · Updated 3 months ago
- Transformer GPU VRAM estimator ☆66 · Updated last year
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆213 · Updated last week
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 · Updated last year
- ☆21 · Updated last year
- A Learning Journey: Micrograd in Mojo 🔥 ☆63 · Updated last year
- 🛠 Self-hosted, fast, and consistent remote configuration for apps. ☆16 · Updated 3 years ago
- Because it's there. ☆16 · Updated last year
- C API for MLX ☆151 · Updated this week
- Using modal.com to process FineWeb-edu data ☆20 · Updated 7 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆197 · Updated 2 months ago
- Chat Markup Language conversation library ☆55 · Updated last year
- Simple high-throughput inference library ☆149 · Updated 6 months ago
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆58 · Updated last year
- Thin wrapper around GGML to make life easier ☆40 · Updated 2 weeks ago
- Your buddy in the (L)LM space. ☆64 · Updated last year
- Vector Database with support for late interaction and token level embeddings. ☆55 · Updated 5 months ago
- A super simple web interface to perform blind tests on LLM outputs. ☆29 · Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆58 · Updated last month
- ☆45 · Updated 2 years ago
- Implementation of nougat that focuses on processing pdf locally. ☆83 · Updated 10 months ago
- Inference of Mamba models in pure C ☆192 · Updated last year
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆58 · Updated last year
- Trace LLM calls (and others) and visualize them in WandB, as interactive SVG or using a streaming local webapp ☆14 · Updated 9 months ago
- look how they massacred my boy ☆63 · Updated last year
- ☆52 · Updated last year