Build LLM from scratch
☆96Nov 19, 2025Updated 3 months ago
Alternatives and similar repositories for llm-from-scratch
Users who are interested in llm-from-scratch are comparing it to the libraries listed below.
- GEMV implementation with CUTLASS☆19Aug 21, 2025Updated 6 months ago
- ☆32Jul 2, 2025Updated 8 months ago
- Efficient GPU communication over multiple NICs.☆26Nov 20, 2025Updated 3 months ago
- A Streaming-Native Serving Engine for TTS/STS Models☆58Feb 22, 2026Updated last week
- ☆88May 31, 2025Updated 9 months ago
- Gensis is a lightweight deep learning framework written from scratch in Python, with Triton as its backend for high-performance computing…☆37Jan 15, 2026Updated last month
- An experimental communicating attention kernel based on DeepEP.☆35Jul 29, 2025Updated 7 months ago
- ☆24Jul 7, 2024Updated last year
- Nex Venus Communication Library☆72Nov 17, 2025Updated 3 months ago
- ☆53Feb 24, 2026Updated last week
- All Resources from Stanford CS106B 2021☆24Jul 11, 2025Updated 7 months ago
- Source code for the paper "Memory-Efficient Fine-Tuning via Low-Rank Activation Compression"☆13Aug 1, 2025Updated 7 months ago
- Prefix-Aware Attention for LLM Decoding☆29Jan 23, 2026Updated last month
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer☆165Feb 11, 2026Updated 3 weeks ago
- I am assigned to a data collection task, to collect 2000 company information from [owler](www.owler.com) and [crunchbase](www.crunchbase.…☆10Feb 15, 2020Updated 6 years ago
- Fine-Grained Knowledge Fusion for Retrieval-Augmented Medical Visual Question☆11Jul 18, 2024Updated last year
- Code accompanying the NeurIPS 2019 paper AutoAssist: A Framework to Accelerate Training of Deep Neural Networks.☆14Oct 3, 2022Updated 3 years ago
- netbeacon - monitoring your network capture, NIDS or network analysis process☆19Oct 26, 2013Updated 12 years ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- shadowsocks☆11Jun 15, 2019Updated 6 years ago
- Unified Sparse Library Wrapper Based on cuSPARSE☆12May 24, 2022Updated 3 years ago
- ☆11Jun 11, 2020Updated 5 years ago
- ☆54Mar 15, 2025Updated 11 months ago
- ☆12May 18, 2024Updated last year
- Peking University Convex Optimization Course given by Professor Wen Zaiwen☆11Jan 11, 2018Updated 8 years ago
- A Coq framework to support structural design and proof of hardware cache-coherence protocols☆14May 7, 2022Updated 3 years ago
- Multi-heap-sort for many small arrays, quicksort with 3 pivots for one big array, CUDA acceleration, CUDA memory compression.☆13Sep 29, 2024Updated last year
- ☆13Jan 21, 2022Updated 4 years ago
- A tool for cross-checking Verilog compilers☆14Apr 16, 2025Updated 10 months ago
- Kernel Module that implements Paxos protocol☆12Oct 23, 2020Updated 5 years ago
- Open-source audio embedding models, submitted to the HEAR 2021 challenge☆11Feb 15, 2026Updated 2 weeks ago
- GEMM☆10Aug 26, 2023Updated 2 years ago
- Analyzing NBA Data☆11Feb 19, 2015Updated 11 years ago
- ☆10Jun 28, 2025Updated 8 months ago
- PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.☆10Feb 10, 2022Updated 4 years ago
- Bioinformatics projects and code shared by Zhi John Lu☆10Jun 3, 2021Updated 4 years ago
- ☆11Sep 21, 2022Updated 3 years ago
- Distributed, Replicated, Protocol-generic Key-value Store in Async Rust For SMR Protocols Research☆17Updated this week
- official build specifications for jupyter☆13Jun 3, 2021Updated 4 years ago