FareedKhan-dev / train-tiny-llm
Train a 29M parameter GPT from Scratch
☆31 Updated 10 months ago
Alternatives and similar repositories for train-tiny-llm
Users that are interested in train-tiny-llm are comparing it to the libraries listed below
- Implementation of a GPT-4o like Multimodal from Scratch using Python ☆76 Updated 9 months ago
- A Straightforward, Step-by-Step Implementation of a Video Diffusion Model ☆72 Updated 5 months ago
- Building LLaMA 4 MoE from Scratch ☆72 Updated 9 months ago
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture. ☆196 Updated last year
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner. ☆197 Updated last year
- Maximizing the Performance of a Simple RAG using RL ☆90 Updated 10 months ago
- UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities ☆148 Updated 8 months ago
- ☆104 Updated 9 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆37 Updated 8 months ago
- A straightforward method for training your LLM, from downloading data to generating text. ☆504 Updated 5 months ago
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting… ☆181 Updated 6 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆114 Updated 9 months ago
- Composition of Multimodal Language Models From Scratch ☆15 Updated last year
- ☆40 Updated last year
- Code for Bolmo: Byteifying the Next Generation of Language Models ☆113 Updated 3 weeks ago
- The official implementation of the paper "Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models". ☆86 Updated 9 months ago
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆169 Updated 4 months ago
- An agent to generate stunning images :) ☆23 Updated 7 months ago
- Code for Medium blog posts ☆105 Updated 3 weeks ago
- Repository for "PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers", NAACL24 ☆152 Updated last year
- Structured pruning and bias visualization for Large Language Models. Tools for LLM optimization and fairness analysis. ☆26 Updated last week
- A Demo of Cache-Augmented Generation (CAG) in an LLM ☆119 Updated 7 months ago
- ☆54 Updated last week
- Let's discover films. ☆28 Updated 9 months ago
- Synthetic Data Generation using LLM via Argilla, Distilabel, ChatGPT, etc. ☆30 Updated last year
- All information and news with respect to Falcon-H1 series ☆105 Updated 3 months ago
- A method for steering LLMs to better follow instructions ☆74 Updated 5 months ago
- Building a GPT-like LLM from scratch with PyTorch. ☆330 Updated last year
- ☆26 Updated last year
- Build your own RAG and run it locally on your laptop: ColBERT + DSPy + Streamlit ☆59 Updated last year