FareedKhan-dev / train-llm-from-scratch
A straightforward method for training your LLM, from downloading data to generating text.
☆455Updated 2 months ago
Alternatives and similar repositories for train-llm-from-scratch
Users who are interested in train-llm-from-scratch are comparing it to the repositories listed below
- Building DeepSeek R1 from Scratch☆713Updated 7 months ago
- ☆674Updated last week
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆189Updated last year
- Building a GPT-like LLM from scratch with PyTorch.☆306Updated 10 months ago
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆188Updated last year
- Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs…☆478Updated 10 months ago
- Model Activity Visualiser☆519Updated 6 months ago
- A Deep Research agent from scratch☆212Updated 5 months ago
- A step by step implementation of a complex RAG pipeline to solve real world situations☆325Updated 3 months ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine☆486Updated 3 months ago
- Implementation of a GPT-4o-like multimodal model from scratch using Python☆73Updated 6 months ago
- Educational implementation of a small GPT model from scratch in a single Jupyter Notebook☆112Updated 8 months ago
- Building LLaMA 4 MoE from Scratch☆67Updated 6 months ago
- Converting Unstructured Data to a Knowledge Graph: An End-to-End Pipeline☆261Updated 6 months ago
- 😎 Awesome list of Retrieval-Augmented Generation (RAG) applications in Generative AI.☆759Updated 2 weeks ago
- Complete pipeline for Training Model Behavior in Agentic Systems☆635Updated last week
- Maximizing the Performance of a Simple RAG using RL☆82Updated 7 months ago
- Build datasets using natural language☆534Updated last month
- Make any LLM think like OpenAI o1 and DeepSeek R1☆490Updated 8 months ago
- CPU inference for the DeepSeek family of large language models in C++☆314Updated 3 weeks ago
- LettuceDetect is a hallucination detection framework for RAG applications.☆507Updated last month
- Train a Language Model with GRPO to create a schedule from a list of events and priorities☆242Updated 6 months ago
- A simple Python program to implement the search-extract-summarize flow.☆273Updated 4 months ago
- ☆53Updated 3 months ago
- Implement a reasoning LLM in PyTorch from scratch, step by step☆1,828Updated this week
- Ollama's Interactive Prompt Engineering Tutorial☆259Updated 10 months ago
- A flexible, adaptive classification system for dynamic text classification☆480Updated 3 weeks ago
- Multimodal AI agent with Llama 3.2: A Streamlit app that processes text, images, PDFs, and PPTs, integrating NIM microservices, Milvus, a…☆133Updated last year
- A roadmap for "generative AI" learning resources☆287Updated last year
- [EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge!☆455Updated last year