Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.
☆209 · May 12, 2024 · Updated 2 years ago
Alternatives and similar repositories for create-million-parameter-llm-from-scratch
Users who are interested in create-million-parameter-llm-from-scratch are comparing it to the libraries listed below.
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner. ☆209 · Aug 23, 2024 · Updated last year
- Understanding Large Language Transformer Architecture like a child ☆33 · Apr 3, 2024 · Updated 2 years ago
- Train a 29M parameter GPT from Scratch ☆39 · Mar 4, 2025 · Updated last year
- A straightforward method for training your LLM, from downloading data to generating text. ☆1,595 · May 22, 2026 · Updated last week
- Notes and code for Programming Massively Parallel Processors ☆13 · Mar 29, 2025 · Updated last year
- A Straightforward, Step-by-Step Implementation of a Video Diffusion Model ☆83 · Aug 18, 2025 · Updated 9 months ago
- 100 Days of GPU Challenge ☆26 · Nov 15, 2025 · Updated 6 months ago
- ☆15 · Apr 21, 2024 · Updated 2 years ago
- PaLM-Kosmos-Vision is a foundational project showcasing basic ChatGPT with vision capabilities, inviting further development for advanced… ☆16 · Nov 15, 2023 · Updated 2 years ago
- Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget ☆168 · Aug 11, 2025 · Updated 9 months ago
- Llama from scratch, or How to implement a paper without crying ☆579 · May 29, 2024 · Updated 2 years ago
- AgenticSearch operates within an agentic workflow, utilizing Gemini 2.0 and an extensive tool registry to handle complex questions. By in… ☆32 · Jan 16, 2025 · Updated last year
- NLP/LLM Mlops Pipeline to dev/train/evaluation, scalable deploy and monitoring systems. ☆22 · Mar 15, 2024 · Updated 2 years ago
- ☆12 · Feb 16, 2026 · Updated 3 months ago
- Notebooks from YouTube videos ☆19 · Dec 27, 2021 · Updated 4 years ago
- Trained a 114 million Parameter LLM from Scratch. ☆19 · Jul 21, 2024 · Updated last year
- ☆24 · Jun 12, 2024 · Updated last year
- Chain MiniMax Speech + Nano Banana Pro + Wan 2.6 to generate videos from script segments. Built for the official Wan 2.6 release with fal… ☆28 · Dec 19, 2025 · Updated 5 months ago
- An implementation of the base GPT-3 Model architecture from the paper by OpenAI "Language Models are Few-Shot Learners" ☆21 · Jun 29, 2024 · Updated last year
- An LLM-powered advanced RAG pipeline built from scratch ☆859 · Jan 26, 2024 · Updated 2 years ago
- An unofficial PyTorch implementation of 'Efficient Infinite Context Transformers with Infini-attention' ☆56 · Aug 19, 2024 · Updated last year
- A Step-by-Step Implementation of RAPTOR-based RAG ☆40 · Sep 1, 2025 · Updated 8 months ago
- An app to organize your research: A Paper Based Approach ☆22 · Feb 26, 2023 · Updated 3 years ago
- A thin veneer of F#ness around several different frameworks to make a lightweight MVC framework. ☆17 · Sep 5, 2011 · Updated 14 years ago
- Microservice for user authentication and authorization based on the JWT mechanism with role-based access control. Project implements Event Driven… ☆28 · May 15, 2025 · Updated last year
- Implement a ChatGPT-like LLM in PyTorch from scratch, step by step ☆96,148 · May 23, 2026 · Updated last week
- Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No Dependencies. ☆24 · Nov 26, 2025 · Updated 6 months ago
- Foundation Models for Geospatial Reasoning: Assessing the Capabilities of Large Language Models in Understanding Geometries and Topologic… ☆30 · May 30, 2025 · Updated last year
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats. ☆32 · May 29, 2023 · Updated 3 years ago
- Simple repository for training small reasoning models ☆52 · Feb 17, 2026 · Updated 3 months ago
- Created and enhanced a local LLM training system on Apple Silicon with MLX and Metal API, overcoming the absence of CUDA support. Fine-tu… ☆29 · May 29, 2024 · Updated 2 years ago
- C# Word2Vec object with fast neighbor search. Format compatible with gensim ☆25 · May 18, 2020 · Updated 6 years ago
- LLM query engine to retrieve augmented responses from JSON files. ☆15 · Oct 12, 2023 · Updated 2 years ago
- This project contains a step-by-step guide on how to design an advanced agentic memory for your LLM-based applications. ☆54 · Apr 28, 2025 · Updated last year
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an… ☆18 · Oct 13, 2025 · Updated 7 months ago
- ☆15 · Jan 30, 2025 · Updated last year
- A JAX-like function transformation engine, but micro: microjax ☆34 · Oct 25, 2024 · Updated last year
- Using RAG to generate data for model fine-tuning. ☆14 · Apr 16, 2025 · Updated last year
- GreenMe - GenAI app - Reduce Carbon Footprint for a Greener Future. ☆15 · Jan 3, 2025 · Updated last year