Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.
☆207May 12, 2024Updated last year
Alternatives and similar repositories for create-million-parameter-llm-from-scratch
Users that are interested in create-million-parameter-llm-from-scratch are comparing it to the libraries listed below.
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆206Aug 23, 2024Updated last year
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- A straightforward method for training your LLM, from downloading data to generating text.☆565Aug 3, 2025Updated 9 months ago
- Train a 29M parameter GPT from Scratch☆37Mar 4, 2025Updated last year
- A Straightforward, Step-by-Step Implementation of a Video Diffusion Model☆82Aug 18, 2025Updated 8 months ago
- Building LLMs from scratch following the book from S. Raschka☆34Mar 27, 2025Updated last year
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget☆167Aug 11, 2025Updated 8 months ago
- Llama from scratch, or How to implement a paper without crying☆581May 29, 2024Updated last year
- AgenticSearch operates within an agentic workflow, utilizing Gemini 2.0 and an extensive tool registry to handle complex questions. By in…☆31Jan 16, 2025Updated last year
- NLP/LLM MLOps pipeline for development, training, evaluation, scalable deployment, and monitoring.☆22Mar 15, 2024Updated 2 years ago
- Notebooks from YouTube videos☆19Dec 27, 2021Updated 4 years ago
- Using the OpenAI Gym library, I implemented two reinforcement learning algorithms in the Frozen Lake environment.☆11Feb 10, 2024Updated 2 years ago
- Intelligent Help for Efficient Programming☆18Jan 11, 2024Updated 2 years ago
- ☆24Jun 12, 2024Updated last year
- Chain MiniMax Speech + Nano Banana Pro + Wan 2.6 to generate videos from script segments. Built for the official Wan 2.6 release with fal…☆25Dec 19, 2025Updated 4 months ago
- An implementation of the base GPT-3 model architecture from the OpenAI paper "Language Models are Few-Shot Learners"☆21Jun 29, 2024Updated last year
- Code for the TensorFlow wide and deep codelab☆12Sep 23, 2016Updated 9 years ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated last year
- An unofficial pytorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'☆55Aug 19, 2024Updated last year
- A full-stack web chatbot application integrated with Ollama☆12Jul 31, 2024Updated last year
- Personal ChatGPT allows you to enhance the power of ChatGPT with your personal data using LangChain☆21Nov 26, 2023Updated 2 years ago
- A step-by-step implementation of RAPTOR-based RAG☆40Sep 1, 2025Updated 8 months ago
- Building a GPT-like LLM from scratch with PyTorch.☆347Dec 20, 2024Updated last year
- LLM as World Models using Bayesian inference☆17May 27, 2025Updated 11 months ago
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 9 months ago
- Creating the DeepSeek V3 model from scratch☆28Mar 28, 2025Updated last year
- Implement a ChatGPT-like LLM in PyTorch from scratch, step by step☆91,948Apr 16, 2026Updated 3 weeks ago
- Microservice for user authentication, authorization based on JWT mechanism with role-based access control. Project implement Event Driven…☆28May 15, 2025Updated 11 months ago
- Intuitive RAG system on top of LlamaIndex☆15Nov 8, 2024Updated last year
- A chatbot based on an OpenAI GPT model that can search Google, built with LangChain and a Gradio UI☆26Apr 14, 2023Updated 3 years ago
- LiteGPT: A 124M Small Language Model (SLM) pre-trained on FineWeb and fine-tuned on Alpaca.☆35Dec 16, 2025Updated 4 months ago
- Notes on reproducing various LLM components from scratch☆250Apr 29, 2025Updated last year
- Created and enhanced a local LLM training system on Apple Silicon with MLX and Metal API, overcoming the absence of CUDA support. Fine-tu…☆29May 29, 2024Updated last year
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…☆18Oct 13, 2025Updated 6 months ago
- Using RAG to generate data for model fine-tuning.☆13Apr 16, 2025Updated last year
- Building DeepSeek R1 from Scratch☆753Mar 21, 2025Updated last year
- Finetuning and Inference of Llama2 7b model on colab☆14Jul 19, 2023Updated 2 years ago
- flowchart dataset proposed with FR-DETR☆10Jun 28, 2022Updated 3 years ago
- Train a 1B LLM on 1T tokens from scratch as an individual☆800Apr 27, 2025Updated last year