FareedKhan-dev / train-llm-from-scratch

A straightforward method for training your LLM, from downloading data to generating text.

☆414

Alternatives and similar repositories for train-llm-from-scratch

Users who are interested in train-llm-from-scratch are comparing it to the libraries listed below.

FareedKhan-dev / train-deepseek-r1
Building DeepSeek R1 from Scratch
☆670 Updated 4 months ago
FareedKhan-dev / create-million-parameter-llm-from-scratch
Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.
☆181 Updated last year
FareedKhan-dev / Building-llama3-from-scratch
LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.
☆175 Updated 11 months ago
attentionmech / mav
Model Activity Visualiser
☆517 Updated 4 months ago
Curated-Awesome-Lists / awesome-llms-fine-tuning
Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs…
☆448 Updated 8 months ago
liyuan24 / nanoDeepResearch
A Deep Research agent from scratch
☆201 Updated 2 months ago
JohnMachado11 / Build-a-Large-Language-Model-from-Scratch
Building a GPT-like LLM from scratch with PyTorch.
☆274 Updated 7 months ago
andysingal / llm-course
☆621 Updated this week
FareedKhan-dev / rag-with-rl
Maximizing the Performance of a Simple RAG using RL
☆70 Updated 4 months ago
harishsg993010 / LLM-Reasoner
Make any LLM think like OpenAI o1 and DeepSeek R1
☆491 Updated 6 months ago
dCaples / AutoDidact
Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.
☆650 Updated 4 months ago
argilla-io / synthetic-data-generator
Build datasets using natural language
☆507 Updated 2 months ago
kevinpdev / gpt-from-scratch
Educational implementation of a small GPT model from scratch in a single Jupyter Notebook
☆104 Updated 5 months ago
codelion / adaptive-classifier
A flexible, adaptive classification system for dynamic text classification
☆353 Updated 2 weeks ago
FareedKhan-dev / gpt4o-from-scratch
Implementation of a GPT-4o-like multimodal model from scratch using Python
☆69 Updated 4 months ago
KRLabsOrg / LettuceDetect
LettuceDetect is a hallucination detection framework for RAG applications.
☆474 Updated 2 months ago
horus-ai-labs / DistillFlow
Library for model distillation
☆148 Updated 5 months ago
IntelLabs / RAG-FiT
Framework for enhancing LLMs for RAG tasks using fine-tuning.
☆747 Updated 2 months ago
andrewkchan / deepseek.cpp
CPU inference for the DeepSeek family of large language models in C++
☆308 Updated 2 months ago
anakin87 / qwen-scheduler-grpo
Train a Language Model with GRPO to create a schedule from a list of events and priorities
☆217 Updated 3 months ago
Danielskry / Awesome-RAG
😎 Awesome list of Retrieval-Augmented Generation (RAG) applications in Generative AI.
☆558 Updated 3 weeks ago
rodmarkun / SmolML
A fully functional and simple Machine Learning library made entirely from scratch with Python.
☆295 Updated this week
MaxHastings / Kolo
The Fastest Way to Fine-Tune LLMs Locally
☆313 Updated 4 months ago
pengfeng / ask.py
A simple Python program to implement the search-extract-summarize flow.
☆269 Updated last month
FareedKhan-dev / train-llama4
Building LLaMA 4 MoE from Scratch
☆60 Updated 3 months ago
adithya-s-k / VARAG
Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine
☆477 Updated 2 weeks ago
AstraBert / llama-4-researcher
Turn topics into essays in seconds!
☆186 Updated last month
shangshang-wang / Tina
Tina: Tiny Reasoning Models via LoRA
☆274 Updated 2 months ago
SqueezeAILab / TinyAgent
[EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge!
☆429 Updated 11 months ago
qixucen / atom
Atom of Thoughts for Markov LLM Test-Time Scaling
☆580 Updated last month