andrew-silva / mlx-rlhf
An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.
☆29 · Updated last year
Alternatives and similar repositories for mlx-rlhf
Users interested in mlx-rlhf are comparing it to the libraries listed below:
- ☆62 · Updated 3 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 · Updated 4 months ago
- look how they massacred my boy ☆63 · Updated 8 months ago
- Simple GRPO scripts and configurations. ☆58 · Updated 4 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆27 · Updated 2 months ago
- Minimal, clean-code implementation of RAG with MLX using GGUF model weights ☆51 · Updated last year
- Train your own SOTA deductive reasoning model ☆94 · Updated 3 months ago
- Score LLM pretraining data with classifiers ☆55 · Updated last year
- ☆66 · Updated last year
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆80 · Updated last month
- tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM. ☆38 · Updated 9 months ago
- Simple repository for training small reasoning models ☆31 · Updated 4 months ago
- ☆20 · Updated 3 months ago
- ☆38 · Updated 10 months ago
- Using multiple LLMs for ensemble forecasting ☆16 · Updated last year
- Fast approximate inference on a single GPU with sparsity-aware offloading ☆38 · Updated last year
- An LLM reads a paper and produces a working prototype ☆57 · Updated 2 months ago
- Approximating the joint distribution of language models via MCTS ☆21 · Updated 7 months ago
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆40 · Updated 4 months ago
- Simple examples using Argilla tools to build AI ☆53 · Updated 7 months ago
- ☆17 · Updated 4 months ago
- Never forget anything again! Combine AI and intelligent tooling for a local knowledge base to track, catalogue, annotate, and plan for you… ☆37 · Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all large language models ☆69 · Updated last year
- MLX port of xjdr's entropix sampler (mimics the JAX implementation) ☆64 · Updated 7 months ago
- ☆48 · Updated last year
- A version of BabyAGI using DSPy and typed predictors ☆17 · Updated last year
- Lego for GRPO ☆28 · Updated 3 weeks ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 · Updated 4 months ago
- Karpathy's llama2.c transpiled to MLX for Apple Silicon ☆15 · Updated last year
- Using modal.com to process FineWeb-edu data ☆20 · Updated 2 months ago