In this repository, I'm going to implement increasingly complex LLM inference optimizations
☆84 · May 22, 2025 · Updated 9 months ago
Alternatives and similar repositories for llm-inference-optimizations-explained
Users that are interested in llm-inference-optimizations-explained are comparing it to the libraries listed below
- Custom triton kernels for training Karpathy's nanoGPT. ☆19 · Oct 21, 2024 · Updated last year
- ☆15 · Apr 26, 2025 · Updated 10 months ago
- Code from the CMU LM inference fall 2025 edition. ☆34 · Dec 7, 2025 · Updated 2 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 · Oct 18, 2025 · Updated 4 months ago
- ☆13 · Dec 21, 2025 · Updated 2 months ago
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]; ☆40 · Jan 4, 2024 · Updated 2 years ago
- i will automate factorio ☆113 · Jul 31, 2024 · Updated last year
- Project code for training LLMs to write better unit tests + code ☆21 · May 19, 2025 · Updated 9 months ago
- Generative Modeling via Drifting in MLX ☆42 · Feb 6, 2026 · Updated 3 weeks ago
- Model Activity Visualiser ☆521 · Apr 9, 2025 · Updated 10 months ago
- Faster Whisper ASR transcription with CTranslate2 ☆24 · Oct 25, 2024 · Updated last year
- Re-implementation of local descriptor HardNet training in fasta2+kornia ☆21 · Apr 6, 2020 · Updated 5 years ago
- Collection of autoregressive model implementations ☆85 · Feb 23, 2026 · Updated last week
- ☆44 · Updated this week
- Collection of resources for RL and Reasoning ☆27 · Feb 3, 2025 · Updated last year
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆62 · Nov 4, 2024 · Updated last year
- we have ai at home ☆72 · Feb 18, 2026 · Updated 2 weeks ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆61 · Apr 8, 2024 · Updated last year
- Python Script for Structuring data from SEC Form D filings using DuckDB and Python with a display layer using Evidence ☆28 · Aug 17, 2024 · Updated last year
- Learn CUDA with PyTorch ☆231 · Feb 23, 2026 · Updated last week
- ☆30 · Jan 26, 2023 · Updated 3 years ago
- Educational WIP ☆68 · Feb 16, 2026 · Updated 2 weeks ago
- ☆71 · Aug 27, 2024 · Updated last year
- Following Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆172 · Jul 31, 2024 · Updated last year
- Lottery Ticket Adaptation ☆40 · Nov 20, 2024 · Updated last year
- ☆80 · Jun 5, 2024 · Updated last year
- Retrieve the source code for any model made available on replicate.com! ☆36 · Jan 22, 2024 · Updated 2 years ago
- ☆28 · Dec 3, 2025 · Updated 3 months ago
- Realtime voice agents for role play and more. ☆41 · Mar 7, 2025 · Updated 11 months ago
- Word2vec source code with detailed bilingual annotations (well-annotated word2vec) ☆10 · Oct 3, 2021 · Updated 4 years ago
- Web browser version of StarCoder.cpp ☆46 · Jul 30, 2023 · Updated 2 years ago
- ☆66 · Aug 5, 2025 · Updated 7 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆43 · Feb 27, 2025 · Updated last year
- ☆38 · Mar 12, 2024 · Updated last year
- Commit0: Library Generation from Scratch ☆187 · Feb 24, 2026 · Updated last week
- Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep… ☆12 · Jul 26, 2023 · Updated 2 years ago
- I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row challenge for Java so naturally I tried … ☆14 · Jan 10, 2024 · Updated 2 years ago
- [CVPR 2021] FMO Deblurring Benchmark ☆13 · Jan 12, 2022 · Updated 4 years ago
- A salesforce library designed to provide idiomatic clojure representations of salesforce data and metadata ☆11 · Jan 14, 2020 · Updated 6 years ago