tensorwavecloud / ScalarLM
ScalarLM - a unified training and inference stack
☆55 Updated 3 weeks ago
Alternatives and similar repositories for ScalarLM
Users that are interested in ScalarLM are comparing it to the libraries listed below
- Cray-LM unified training and inference stack. ☆22 Updated 7 months ago
- SIMD quantization kernels ☆83 Updated last week
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP ☆111 Updated this week
- PyTorch Single Controller ☆374 Updated this week
- ☆220 Updated 2 months ago
- ☆206 Updated this week
- look how they massacred my boy ☆64 Updated 10 months ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆51 Updated this week
- Where GPUs get cooked 👩🍳🔥 ☆279 Updated 3 weeks ago
- ☆19 Updated last year
- ☆217 Updated 7 months ago
- ☆66 Updated 3 months ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆52 Updated last week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆88 Updated this week
- train with kittens! ☆62 Updated 10 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆191 Updated 3 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆389 Updated 2 weeks ago
- Docker image NVIDIA GH200 machines - optimized for vllm serving and hf trainer finetuning ☆47 Updated 6 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆290 Updated last week
- 👷 Build compute kernels ☆119 Updated this week
- An introduction to LLM Sampling ☆79 Updated 8 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 Updated 6 months ago
- Training-Ready RL Environments + Evals ☆65 Updated this week
- ☆238 Updated this week
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ☆155 Updated 3 months ago
- Storing long contexts in tiny caches with self-study ☆145 Updated last week
- NanoGPT-speedrunning for the poor T4 enjoyers ☆69 Updated 4 months ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆61 Updated 3 months ago
- Google TPU optimizations for transformers models ☆120 Updated 7 months ago
- ☆39 Updated last year