TheProxyCompany / proxy-inference-engine
Optimized LLM inference for Apple Silicon using MLX.
☆10Updated this week
Alternatives and similar repositories for proxy-inference-engine
Users who are interested in proxy-inference-engine are comparing it to the libraries listed below.
- ☆46Updated last month
- NanoGPT-speedrunning for the poor T4 enjoyers☆65Updated 3 weeks ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆98Updated 2 months ago
- Simple repository for training small reasoning models☆27Updated 3 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆34Updated 2 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon.☆75Updated 2 weeks ago
- ☆27Updated 10 months ago
- LLM attention pattern visualizer☆10Updated last year
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache☆99Updated 3 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆53Updated 3 months ago
- LLMs represent numbers on a helix and manipulate that helix to do addition.☆24Updated 3 months ago
- ☆13Updated last week
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from DeepMind☆50Updated 3 months ago
- An alternative way of calculating self-attention☆18Updated 11 months ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.☆44Updated 8 months ago
- EvaByte: Efficient Byte-level Language Models at Scale☆97Updated 3 weeks ago
- ☆20Updated 5 months ago
- look how they massacred my boy☆63Updated 7 months ago
- Because it's there.☆16Updated 7 months ago
- Collection of autoregressive model implementations☆85Updated 3 weeks ago
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated this week
- Testing LLM reasoning abilities with family relationship quizzes.☆62Updated 3 months ago
- Lightweight Llama 3 8B Inference Engine in CUDA C☆47Updated last month
- Exploration into the Firefly algorithm in PyTorch☆38Updated 3 months ago
- ☆30Updated last week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆65Updated last month
- Lottery Ticket Adaptation☆39Updated 5 months ago
- Latent Large Language Models☆18Updated 8 months ago
- Focused on fast experimentation and simplicity☆72Updated 4 months ago
- Training hybrid models for dummies.☆21Updated 4 months ago