zhuzilin / faster-nougat
Implementation of Nougat that focuses on processing PDFs locally.
☆81 Updated 3 months ago
Alternatives and similar repositories for faster-nougat:
Users who are interested in faster-nougat are comparing it to the libraries listed below
- Minimal, clean code implementation of RAG with MLX using GGUF model weights ☆49 Updated last year
- ☆82 Updated 3 months ago
- Distributed inference for MLX LLMs ☆91 Updated 9 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆78 Updated 5 months ago
- Scripts to create your own MoE models using MLX ☆89 Updated last year
- MLX Swift implementation of Andrej Karpathy's "Let's build GPT" video ☆57 Updated last year
- Chat Markup Language conversation library ☆55 Updated last year
- One Line To Build Zero-Data Classifiers in Minutes ☆53 Updated 7 months ago
- Run embeddings in MLX ☆88 Updated 7 months ago
- Very basic framework for composable parameterized large language model (Q)LoRA / (Q)DoRA fine-tuning using mlx, mlx_lm, and OgbujiPT. ☆40 Updated 2 months ago
- ☆113 Updated 4 months ago
- An implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM. ☆38 Updated 8 months ago
- A framework for evaluating function calls made by LLMs ☆37 Updated 9 months ago
- For inferring and serving local LLMs using the MLX framework ☆103 Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆52 Updated 3 months ago
- ☆38 Updated last year
- Transcribe and summarize videos using Whisper and LLMs on Apple's MLX framework ☆74 Updated last year
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆26 Updated 10 months ago
- Fast parallel LLM inference for MLX ☆187 Updated 10 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆87 Updated 2 weeks ago
- Hugging Face chat-ui integration with the mlx-lm server ☆60 Updated last year
- ☆156 Updated 9 months ago
- Inference code for mixtral-8x7b-32kseqlen ☆100 Updated last year
- ☆19 Updated 9 months ago
- KAN (Kolmogorov–Arnold Networks) in the MLX framework for Apple Silicon ☆16 Updated last week
- Automatic fine-tuning of models with synthetic data ☆75 Updated last year
- ☆73 Updated last year
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 4 months ago
- LLM inference in C/C++ ☆76 Updated this week