smpanaro / ModernBERT-AppleNeuralEngine
ModernBERT model optimized for Apple Neural Engine.
☆30Updated last year
Alternatives and similar repositories for ModernBERT-AppleNeuralEngine
Users who are interested in ModernBERT-AppleNeuralEngine are comparing it to the libraries listed below.
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"☆11Updated last year
- run embeddings in MLX☆97Updated last year
- Profile your CoreML models directly from Python 🐍☆30Updated 5 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon.☆85Updated 5 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX.☆273Updated this week
- Find out why your CoreML model isn't running on the Neural Engine!☆30Updated last year
- Fast parallel LLM inference for MLX☆246Updated last year
- ☆15Updated last year
- ☆129Updated 7 months ago
- mlx implementations of various transformers, speedups, training☆33Updated 2 years ago
- MLX Transformers is a library that provides model implementation in MLX. It uses a similar model interface as HuggingFace Transformers an…☆72Updated last year
- ☆27Updated last year
- MLX support for the Open Neural Network Exchange (ONNX)☆63Updated last year
- Start a server from the MLX library.☆198Updated last year
- 1.58 Bit LLM on Apple Silicon using MLX☆243Updated last year
- ☆219Updated last year
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX.☆100Updated 7 months ago
- Implementation of nougat that focuses on processing pdf locally.☆84Updated last year
- ☆20Updated 3 weeks ago
- ☆68Updated last year
- C API for MLX☆174Updated this week
- Port of Andrej Karpathy's nanoGPT to Apple MLX framework.☆117Updated 2 years ago
- mlx image models for Apple Silicon machines☆91Updated 2 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆156Updated 6 months ago
- MoE training for Me and You and maybe other people☆353Updated this week
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX☆59Updated 2 years ago
- ☆137Updated last year
- FlashAttention (Metal Port)☆579Updated last year
- For inferring and serving local LLMs using the MLX framework☆110Updated last year
- look how they massacred my boy☆63Updated last year