TheProxyCompany / proxy-inference-engine

Optimized LLM inference for Apple Silicon using MLX.

☆11

Alternatives and similar repositories for proxy-inference-engine

Users who are interested in proxy-inference-engine are comparing it to the libraries listed below.

VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆66 Updated last month
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆100 Updated 3 months ago
tokenbender / avataRL
rl from zero pretrain, can it be done? we'll see.
☆37 Updated this week
joey00072 / Attention-as-graph
alternative way to calculate self-attention
☆18 Updated last year
fal-ai-community / llmdifftracker
Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)
☆34 Updated 3 months ago
tyler-romero / microR1
Simple repository for training small reasoning models
☆31 Updated 4 months ago
HammingHQ / bug-in-the-code-stack
A new benchmark for measuring LLMs' capability to detect bugs in large codebases.
☆30 Updated last year
attentionmech / tensorlens
aesthetic tensor visualiser
☆22 Updated last month
xjdr-alt / llmri
look how they massacred my boy
☆63 Updated 7 months ago
N8python / mlx-pretrain
A simple MLX implementation for pretraining LLMs on Apple Silicon.
☆76 Updated last month
apehex / tokun
Tokun to can tokens
☆17 Updated this week
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆58 Updated 2 weeks ago
joey00072 / Multi-Head-Latent-Attention-MLA-
working implementation of DeepSeek MLA
☆42 Updated 4 months ago
kmohan321 / Research_Papers
☆46 Updated 2 months ago
okarthikb / attention-visualizer
LLM attention pattern visualizer
☆10 Updated last year
okarthikb / state-space-models
☆27 Updated 10 months ago
brendanhogan / picoDeepResearch
☆59 Updated 2 weeks ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆53 Updated 4 months ago
fal-ai / diffusion-speedrun
Focused on fast experimentation and simplicity
☆73 Updated 5 months ago
kubernetes-bad / reward-composer
Lego for GRPO
☆28 Updated last week
haizelabs / j1-micro
j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.
☆74 Updated last week
thomasnormal / fewshot
☆28 Updated 8 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆93 Updated 3 months ago
joey00072 / ohara
Collection of autoregressive model implementations
☆85 Updated last month
Aleph-Alpha-Research / scaling
Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…
☆62 Updated 7 months ago
SonicCodes / lucid-v1
realtime latent world model inference demo
☆46 Updated 6 months ago
enjalot / latent-data-modal
Using modal.com to process FineWeb-edu data
☆20 Updated 2 months ago
evanatyourservice / llm-jax
Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆17 Updated 2 months ago
kyegomez / OpenStrawberry
An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO
☆29 Updated this week
anpaure / cp_eval
Tiny evaluation of leading LLMs on competitive programming problems
☆14 Updated 6 months ago