FareedKhan-dev / train-llm-from-scratch
A straightforward method for training your LLM, from downloading data to generating text.
☆479Updated 4 months ago
Alternatives and similar repositories for train-llm-from-scratch
Users who are interested in train-llm-from-scratch are comparing it to the repositories listed below
- Building DeepSeek R1 from Scratch☆727Updated 8 months ago
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆192Updated last year
- Building a GPT-like LLM from scratch with PyTorch.☆318Updated 11 months ago
- Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs…☆490Updated last year
- ☆712Updated last week
- Model Activity Visualiser☆519Updated 8 months ago
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆191Updated last year
- Building LLaMA 4 MoE from Scratch☆68Updated 7 months ago
- A Deep Research agent from scratch☆212Updated 6 months ago
- Maximizing the Performance of a Simple RAG using RL☆87Updated 8 months ago
- Implementation of a GPT-4o-like Multimodal Model from Scratch using Python☆74Updated 8 months ago
- A step-by-step implementation of a complex RAG pipeline to solve real-world situations☆364Updated 5 months ago
- Converting Unstructured Data to a Knowledge Graph: An End-to-End Pipeline☆276Updated 7 months ago
- A Straightforward, Step-by-Step Implementation of a Video Diffusion Model☆67Updated 3 months ago
- ☆60Updated 4 months ago
- Build datasets using natural language☆548Updated 2 months ago
- Deep research agent to help you find the best GitHub repositories 🕵️!☆823Updated 2 weeks ago
- Make any LLM think like OpenAI o1 and DeepSeek R1☆492Updated 10 months ago
- Educational implementation of a small GPT model from scratch in a single Jupyter Notebook☆117Updated 9 months ago
- dLLM: Simple Diffusion Language Modeling☆1,261Updated this week
- Implement a reasoning LLM in PyTorch from scratch, step by step☆2,135Updated 2 weeks ago
- CPU inference for the DeepSeek family of large language models in C++☆315Updated 2 months ago
- This repo compiles a collection of examples that demonstrate the effective use of the ReAct pattern in LLM prompting. It includes variati…☆145Updated 6 months ago
- A command-line interface tool for serving LLMs using vLLM.☆454Updated last week
- An LLM cookbook for building your own from scratch, all the way from gathering data to training a model☆164Updated last year
- ☆622Updated 9 months ago
- Train LLM Model Behavior☆667Updated this week
- ☆2,197Updated last week
- Train a Language Model with GRPO to create a schedule from a list of events and priorities☆250Updated 7 months ago
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.☆1,079Updated last week