ashwinktpu / StarPlat

☆21

Alternatives and similar repositories for StarPlat

Users who are interested in StarPlat are comparing it to the repositories listed below.

tanmaysachan / splitcompute
Split model weights and execute partially
☆4 Updated 11 months ago
siboehm / ShallowSpeed
Small scale distributed training of sequential deep learning models, built on Numpy and MPI.
☆134 Updated last year
charlesfrye / cuda-substrings
Because it's there.
☆16 Updated 9 months ago
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆61 Updated last month
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆67 Updated last week
tokenbender / avataRL
rl from zero pretrain, can it be done? we'll see.
☆56 Updated this week
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆66 Updated 2 months ago
yash-srivastava19 / arrakis
Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.
☆29 Updated 2 months ago
nano-R1 / resources
Compiling useful links, papers, benchmarks, ideas, etc.
☆46 Updated 3 months ago
axonn-ai / axonn
A parallel framework for training deep neural networks
☆61 Updated 3 months ago
gpu-mode / discord-cluster-manager
Write a fast kernel and run it on Discord. See how you compare against the best!
☆46 Updated this week
commit-0 / commit0
Commit0: Library Generation from Scratch
☆155 Updated last month
kimbochen / md-blogs
A blog where I write about research papers and blog posts I read.
☆12 Updated 7 months ago
pyember / ember
☆182 Updated 2 months ago
UmerHA / triton_util
Make triton easier
☆46 Updated last year
gau-nernst / learn-cuda
Learn CUDA with PyTorch
☆27 Updated this week
jxmorris12 / bm25_pt
minimal pytorch implementation of bm25 (with sparse tensors)
☆101 Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆53 Updated 4 months ago
Edward-Sun / gpt-accelera
Simple and efficient pytorch-native transformer training and inference (batched)
☆76 Updated last year
AnswerDotAI / fastkmeans
☆61 Updated last week
structuredllm / syncode
Efficient and general syntactical decoding for Large Language Models
☆278 Updated 2 weeks ago
ahxt / mini-r1-zero
☆20 Updated 4 months ago
66RING / CritiPrefill
Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs".
☆14 Updated 9 months ago
tensara / tensara
Competitive GPU kernel optimization platform.
☆79 Updated 2 weeks ago
cloneofsimo / ptx-tutorial-by-aislop
PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7)
☆66 Updated 3 months ago
PrimeIntellect-ai / pi-quant
SIMD quantization kernels
☆72 Updated this week
Jaykef / Triton-nanoGPT
Custom triton kernels for training Karpathy's nanoGPT.
☆19 Updated 8 months ago
aastroza / structured-generation-benchmark
Structured Generation Evals
☆12 Updated 9 months ago
JoshuaPurtell / SmallBench
Small, simple agent task environments for training and evaluation
☆18 Updated 7 months ago
xjdr-alt / mla_blog_translation
☆13 Updated last year