antirez / LLM-FTC-sampling
First token cutoff sampling inference example
☆28 · Updated 8 months ago
Related projects:
- LLama implementations benchmarking framework ☆10 · Updated 10 months ago
- A super simple web interface to perform blind tests on LLM outputs. ☆24 · Updated 6 months ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆45 · Updated 2 months ago
- Run embedding models using ONNX ☆23 · Updated 7 months ago
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆21 · Updated 2 months ago
- Binary vector search example using Unum's USearch engine and pre-computed Wikipedia embeddings from Co:here and MixedBread ☆18 · Updated 5 months ago
- Like picoGPT but for BERT. ☆50 · Updated last year
- A Learning Journey: Micrograd in Mojo 🔥 ☆57 · Updated 3 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆23 · Updated last week
- Public reports detailing responses to sets of prompts by Large Language Models. ☆25 · Updated 11 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆51 · Updated 7 months ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆40 · Updated 2 weeks ago
- 1.58 Bit LLM on Apple Silicon using MLX ☆97 · Updated 4 months ago
- Trace LLM calls (and others) and visualize them in WandB, as interactive SVG or using a streaming local webapp ☆13 · Updated 8 months ago
- GitHub repo for Peifeng's internship project ☆12 · Updated 10 months ago
- Voyage AI Official Python Library ☆37 · Updated 3 months ago
- Using modal.com to process FineWeb-edu data ☆18 · Updated 2 weeks ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA ☆35 · Updated 6 months ago
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆44 · Updated 10 months ago
- Embedding models from Jina AI ☆55 · Updated 8 months ago
- MLX Swift implementation of Andrej Karpathy's Let's build GPT video ☆51 · Updated 5 months ago
- Run embeddings in MLX ☆68 · Updated last month
- QLLM: A powerful CLI for seamless interaction with multiple Large Language Models. Simplify AI workflows, streamline development, and unl… ☆23 · Updated this week
- A pipeline that uses API calls to agnostically convert unstructured data into structured training data ☆26 · Updated last year
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆60 · Updated last month
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 · Updated 8 months ago
- An implementation of Self-Extend, which expands the context window via grouped attention ☆117 · Updated 8 months ago