In this repository, I'm going to implement increasingly complex LLM inference optimizations.
☆85 · May 22, 2025 · Updated 11 months ago
Alternatives and similar repositories for llm-inference-optimizations-explained
Users that are interested in llm-inference-optimizations-explained are comparing it to the libraries listed below.
- Python module to make coding hassle free! ☆10 · Jun 1, 2021 · Updated 4 years ago
- Custom Triton kernels for training Karpathy's nanoGPT. ☆19 · Oct 21, 2024 · Updated last year
- ☆15 · Jan 26, 2025 · Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆60 · Oct 18, 2025 · Updated 6 months ago
- ☆30 · Jun 20, 2024 · Updated last year
- Collection of autoregressive model implementations ☆85 · Feb 23, 2026 · Updated 2 months ago
- A C++ port of karpathy/micrograd, a tiny scalar-valued autograd engine and a neural net library ☆13 · Nov 24, 2023 · Updated 2 years ago
- ☆14 · Dec 21, 2025 · Updated 4 months ago
- Fast LLM training codebase with dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆40 · Jan 4, 2024 · Updated 2 years ago
- Notes from the ADRL course taught at IISC as part of the MTech AI curriculum ☆14 · Nov 30, 2024 · Updated last year
- A tiny vectorstore implementation built with NumPy. ☆64 · Apr 26, 2024 · Updated 2 years ago
- A tiny, easily hackable implementation of a feature dashboard. ☆16 · Oct 21, 2025 · Updated 6 months ago
- I trained a 130M Llama architecture that I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset… ☆17 · Mar 26, 2025 · Updated last year
- Project code for training LLMs to write better unit tests + code ☆21 · May 19, 2025 · Updated 11 months ago
- Following Karpathy with a GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish ☆171 · Jul 31, 2024 · Updated last year
- Digital Signal Processing LABs for SUSTECH 2020 FALL (EE323). ☆15 · Jan 2, 2021 · Updated 5 years ago
- Re-implementation of local descriptor HardNet training in fasta2+kornia ☆21 · Apr 6, 2020 · Updated 6 years ago
- ☆12 · Sep 25, 2024 · Updated last year
- Stream of my favorite papers and links ☆44 · Apr 19, 2026 · Updated 2 weeks ago
- NanoGPT (124M) in 5 minutes ☆15 · Feb 14, 2025 · Updated last year
- ☆10 · Apr 23, 2026 · Updated last week
- A locally trained model of Stoney Nakoda has been developed and released. You can access the working model here or train your own instanc… ☆10 · Oct 29, 2025 · Updated 6 months ago
- Faster Whisper ASR transcription with CTranslate2 ☆25 · Oct 25, 2024 · Updated last year
- Voila! A smart automatic pet feeder using Arduino Uno + RTC time module for scheduling + multiple sensors. ☆10 · Jun 4, 2024 · Updated last year
- Retrieve the source code for any model made available on replicate.com! ☆36 · Jan 22, 2024 · Updated 2 years ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆52 · Jul 4, 2025 · Updated 10 months ago
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… ☆10 · Oct 7, 2024 · Updated last year
- ☆79 · Nov 26, 2024 · Updated last year
- All the content of my YouTube channel: https://youtube.com/@florenzerstling?si=7t10PBr6MDha74PO ☆14 · May 28, 2025 · Updated 11 months ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆481 · Mar 10, 2025 · Updated last year
- Open sourced result for The Agent Company ☆22 · Nov 11, 2025 · Updated 5 months ago
- Simulation of job offers and CVs with real-time processing, classification, and analytics using Kafka, Ray, Spark, and Databricks. Includ… ☆14 · Dec 25, 2024 · Updated last year
- A light tensor library in Zig. ☆77 · Feb 9, 2025 · Updated last year
- The Algoz on-chain captcha smart contract repository. ☆10 · Aug 3, 2022 · Updated 3 years ago
- Learnings and programs related to CUDA ☆436 · Jun 29, 2025 · Updated 10 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆61 · Apr 8, 2024 · Updated 2 years ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆62 · Nov 4, 2024 · Updated last year
- ☆10 · Oct 22, 2024 · Updated last year
- ☆16 · Apr 29, 2025 · Updated last year