liyuan24 / deepseek_from_scratch
☆14 Updated 3 months ago
Alternatives and similar repositories for deepseek_from_scratch
Users who are interested in deepseek_from_scratch are comparing it to the repositories listed below
- 100 days of building GPU kernels! ☆462 Updated 2 months ago
- GPU Kernels ☆188 Updated 2 months ago
- ☆350 Updated 3 months ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆378 Updated 4 months ago
- ☆321 Updated 6 months ago
- ☆459 Updated this week
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆188 Updated last month
- ☆179 Updated 6 months ago
- Slides, notes, and materials for the workshop ☆327 Updated last year
- A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in pytorch ☆309 Updated this week
- ☆43 Updated last month
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ☆30 Updated this week
- This repository contains exhaustive coverage of a hands-on approach to PyTorch alongside powerful tools to accelerate model tuning an… ☆121 Updated last week
- An extension of the nanoGPT repository for training small MOE models. ☆160 Updated 4 months ago
- LLaMA 2 implemented from scratch in PyTorch ☆337 Updated last year
- repo of paper implementations ☆20 Updated 4 months ago
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast… ☆360 Updated 4 months ago
- Notes and commented code for RLHF (PPO) ☆99 Updated last year
- PyTorch implementations of algorithms from "Reinforcement Learning: An Introduction" by Sutton and Barto, along with various RL research … ☆151 Updated last week
- ☆200 Updated 5 months ago
- making the official triton tutorials actually comprehensible ☆45 Updated 3 months ago
- Best practices & guides on how to write distributed pytorch training code ☆450 Updated 4 months ago
- Llama from scratch, or How to implement a paper without crying ☆571 Updated last year
- Learnings and programs related to CUDA ☆411 Updated 2 weeks ago
- Notes about LLaMA 2 model ☆63 Updated last year
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆497 Updated last week
- Natural Language Processing Courses with Resources ☆36 Updated 8 months ago
- Apply GPU in ML and DL ☆52 Updated 4 months ago
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch ☆348 Updated 3 months ago
- A bibliography and survey of the papers surrounding o1 ☆1,205 Updated 8 months ago