FareedKhan-dev/create-million-parameter-llm-from-scratch

FareedKhan-dev / create-million-parameter-llm-from-scratch

Building a 2.3M-parameter LLM from scratch with the LLaMA 1 architecture.

☆212

Alternatives and similar repositories for create-million-parameter-llm-from-scratch

Users who are interested in create-million-parameter-llm-from-scratch are comparing it to the libraries listed below.

FareedKhan-dev / Building-llama3-from-scratch
LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.
☆212 · Aug 23, 2024 · Updated last year
FareedKhan-dev / Understanding-Transformers-Step-by-Step-math-example
Understanding Large Language Transformer Architecture like a child
☆34 · Apr 3, 2024 · Updated 2 years ago
FareedKhan-dev / train-tiny-llm
Train a 29M parameter GPT from Scratch
☆48 · Mar 4, 2025 · Updated last year
yarikama / Agentic-Advanced-RAG
Building a multi-agent RAG system with advanced RAG methods
☆13 · Jan 12, 2025 · Updated last year
FareedKhan-dev / create-stable-diffusion-from-scratch
Implemented a stable diffusion architecture using PyTorch.
☆91 · Jan 3, 2024 · Updated 2 years ago
mandliya / PMPP_notes
Notes and code for Programming Massively Parallel Processors
☆13 · Mar 29, 2025 · Updated last year
Sayandip170900 / CUDA-Challenge
100 Days of GPU Challenge
☆26 · Nov 15, 2025 · Updated 8 months ago
AI-ANK / PaLM-Kosmos-Vision
PaLM-Kosmos-Vision is a foundational project showcasing basic ChatGPT with vision capabilities, inviting further development for advanced…
☆16 · Nov 15, 2023 · Updated 2 years ago
bhancockio / bhancockio-crewai-plus-crash-course
☆15 · Apr 21, 2024 · Updated 2 years ago
mallahyari / modernbert-semantic-search
☆12 · Jan 24, 2025 · Updated last year
ngtranminhtuan / LLMOPS
NLP/LLM MLOps pipeline for development, training, and evaluation, with scalable deployment and monitoring systems.
☆22 · Mar 15, 2024 · Updated 2 years ago
keeeeenw / MicroLlama
Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget
☆169 · Aug 11, 2025 · Updated 11 months ago
Aananda-giri / GPT2-Nepali
☆12 · Feb 16, 2026 · Updated 5 months ago
vvr-rao / Training-a-Mini-114M-Parameter-Llama-3-like-Model-from-Scratch
Trained a 114 million Parameter LLM from Scratch.
☆19 · Jul 21, 2024 · Updated 2 years ago
bhancockio / trip-planner-with-crewai-2_0
☆24 · Jun 12, 2024 · Updated 2 years ago
kyegomez / GPT3
An implementation of the base GPT-3 model architecture from the OpenAI paper "Language Models are Few-Shot Learners"
☆22 · Jun 29, 2024 · Updated 2 years ago
shankarlohar / dbt-snowflake-data-pipeline
🚀 A structured data pipeline project using dbt and Snowflake to transform raw data into curated datasets. This project covers data inges…
☆14 · Mar 17, 2025 · Updated last year
nonacosa / gantt.js
gantt-view-js extends jQuery
☆12 · Aug 16, 2017 · Updated 8 years ago
goombalab / Gather-and-Aggregate
Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆16 · Apr 30, 2025 · Updated last year
vmarinowski / infini-attention
An unofficial PyTorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'
☆56 · Aug 19, 2024 · Updated last year
samhoooo / chatapp-ollama
A full-stack web chatbot application integrated with Ollama
☆12 · Jul 31, 2024 · Updated last year
robertpi / PicoMvc
A thin veneer of F#ness around several different frameworks to make a lightweight MVC framework.
☆17 · Sep 5, 2011 · Updated 14 years ago
krishnaKanta2008 / PredictHub
PredictHub is a sophisticated stock price prediction platform that combines machine learning with real-time market data analysis. The app…
☆14 · Aug 15, 2025 · Updated 11 months ago
goddoe / RLYX
A hackable, simple, and research-friendly GRPO training framework with high-speed weight synchronization in a multinode environment.
☆38 · Aug 27, 2025 · Updated 10 months ago
pravin5551 / Ride-Karo
We tried to make an app that lets users book a bike together with a rider who helps the customer reach their destination. This…
☆13 · Jul 14, 2021 · Updated 5 years ago
kevinhall1998 / prompt
A quick-start guide to prompt engineering
☆29 · Aug 30, 2024 · Updated last year
teknium1 / RawTransform
A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats.
☆34 · May 29, 2023 · Updated 3 years ago
Mxoder / LLM-from-scratch
Notes on reproducing LLM-related work from scratch
☆256 · Apr 29, 2025 · Updated last year
tyler-romero / microR1
Simple repository for training small reasoning models
☆51 · Feb 17, 2026 · Updated 5 months ago
arunpshankar / AgenticSearch
AgenticSearch operates within an agentic workflow, utilizing Gemini 2.0 and an extensive tool registry to handle complex questions. By in…
☆32 · Jan 16, 2025 · Updated last year
andrewass / kalgos
Algorithms and Data Structures written in Kotlin
☆21 · Aug 1, 2020 · Updated 5 years ago
krymlov / swe-jyotisa-lib
swe-jyotisa-lib (beta version)
☆19 · May 4, 2026 · Updated 2 months ago
Rupesh-rkgit / FineTuning-and-Inference-Llama2
Fine-tuning and inference of the Llama 2 7B model on Colab
☆14 · Jul 19, 2023 · Updated 3 years ago
FareedKhan-dev / train-deepseek-r1
Building DeepSeek R1 from Scratch
☆782 · Mar 21, 2025 · Updated last year
pimpale / vulkan-triangle-v1
A simple Vulkan Project written in C
☆10 · Feb 14, 2025 · Updated last year
reasonmethis / docdocgo-core
Automate web research way beyond the first page of search results; curate knowledge bases to chat with.
☆45 · Jul 30, 2025 · Updated 11 months ago
mingyin0312 / RL4LLM
RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct
☆31 · Feb 23, 2025 · Updated last year
zhanshijinwat / Steel-LLM
Train a 1B LLM on 1T tokens from scratch as an individual
☆810 · Apr 27, 2025 · Updated last year
FareedKhan-dev / gpt4o-from-scratch
Implementation of a GPT-4o-like multimodal model from scratch using Python
☆78 · Apr 4, 2025 · Updated last year