FareedKhan-dev / train-llm-from-scratch
A straightforward method for training your LLM, from downloading data to generating text.
☆438Updated 2 months ago
Alternatives and similar repositories for train-llm-from-scratch
Users who are interested in train-llm-from-scratch are comparing it to the repositories listed below.
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆186Updated last year
- ☆654Updated last week
- Model Activity Visualiser☆518Updated 6 months ago
- Building a GPT-like LLM from scratch with PyTorch.☆297Updated 9 months ago
- A Deep Research agent from scratch☆211Updated 4 months ago
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆186Updated last year
- Educational implementation of a small GPT model from scratch in a single Jupyter Notebook☆110Updated 7 months ago
- Building DeepSeek R1 from Scratch☆704Updated 6 months ago
- A step by step implementation of a complex RAG pipeline to solve real world situations☆305Updated 3 months ago
- 😎 Awesome list of Retrieval-Augmented Generation (RAG) applications in Generative AI.☆679Updated 2 months ago
- Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs…☆473Updated 10 months ago
- Build datasets using natural language☆529Updated 3 weeks ago
- ☆258Updated last month
- Building LLaMA 4 MoE from Scratch☆64Updated 5 months ago
- Ollama's Interactive Prompt Engineering Tutorial☆256Updated 10 months ago
- This repository provides a Python script to fetch and summarize research papers from arXiv using the free Gemini API☆244Updated 7 months ago
- Implementation of a GPT-4o-like multimodal model from scratch using Python☆72Updated 6 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python.☆298Updated 2 months ago
- An LLM cookbook for building your own from scratch, all the way from gathering data to training a model☆159Updated last year
- Library for model distillation☆153Updated last month
- CPU inference for the DeepSeek family of large language models in C++☆313Updated last week
- Maximizing the Performance of a Simple RAG using RL☆81Updated 6 months ago
- A Straightforward, Step-by-Step Implementation of a Video Diffusion Model☆59Updated last month
- Make any LLM think like OpenAI o1 and DeepSeek R1☆489Updated 8 months ago
- A flexible, adaptive classification system for dynamic text classification☆463Updated 2 weeks ago
- Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.☆667Updated 6 months ago
- Hands-on tutorials on fine-tuning various LLMs using different fine-tuning techniques