PASSIONLab/distributed_sddmm

Distributed SDDMM Kernel

☆12

Alternatives and similar repositories for distributed_sddmm

Users who are interested in distributed_sddmm are comparing it to the libraries listed below.

fw-ai / llama-cuda-graph-example
Example of applying CUDA graphs to LLaMA-v2
☆11 · Aug 25, 2023 · Updated 2 years ago
daochenzha / neuroshard
[MLSys 2023] Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models
☆16 · May 5, 2023 · Updated 3 years ago
efmkoene / AML-2018-cheatsheet
Cheatsheet for Advanced Machine Learning exam @ ETH Zürich, 2018-2019.
☆11 · Jan 23, 2019 · Updated 7 years ago
Mudit7 / CUDA-ResNet
☆13 · May 8, 2020 · Updated 6 years ago
escalab / GPTPU
GPTPU for SC 2021
☆52 · Mar 22, 2023 · Updated 3 years ago
robjsliwa / llama-agent
Fun project to run your own LLM chat bot using llama.cpp
☆11 · Jun 9, 2023 · Updated 3 years ago
marcsous / gpuSparse
Matlab mex wrappers to cuSPARSE (NVIDIA)
☆11 · Dec 10, 2025 · Updated 7 months ago
oresths / tSparse
A GPU algorithm for sparse matrix-matrix multiplication
☆74 · Oct 1, 2020 · Updated 5 years ago
ceruleangu / Block-Sparse-Benchmark
Benchmark for matrix multiplications between dense and block sparse (BSR) matrix in TVM, blocksparse (Gray et al.) and cuSparse.
☆23 · Aug 21, 2020 · Updated 5 years ago
renll / SparseLT
[EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing
☆14 · Feb 10, 2023 · Updated 3 years ago
apuaaChen / vectorSparse
☆32 · Aug 24, 2022 · Updated 3 years ago
microsoft / dist-ir
An IR for efficiently simulating distributed ML computation.
☆33 · Jan 13, 2024 · Updated 2 years ago
ldbc / ldbc_finbench_datagen
A synthetic graph generator on spark for the LDBC Financial Benchmark, featured as temporal graph
☆15 · Apr 12, 2026 · Updated 3 months ago
mlcommons / mobile_open
MLPerf Mobile benchmarks
☆15 · Apr 28, 2026 · Updated 2 months ago
arch-simulator-sig / simulator-paper
☆12 · Sep 18, 2024 · Updated last year
OliverRichter / normalized-attention
Code publication to the paper "Normalized Attention Without Probability Cage"
☆17 · Nov 9, 2021 · Updated 4 years ago
lemariva / wipy2.0-GPS
Connect a Ublox NEO-6M/NEO-M8N GPS module to a WiPy2.0/3.0
☆10 · Apr 29, 2018 · Updated 8 years ago
isaacdlp / bitcoin
Bitcoin trading examples with Backtrader
☆18 · Aug 24, 2018 · Updated 7 years ago
escalab / SIMD2
☆31 · Jun 15, 2022 · Updated 4 years ago
RobertCsordas / moe_layer
sigma-MoE layer
☆21 · Jan 5, 2024 · Updated 2 years ago
GATECH-EIC / LLM4HWDesign_Starting_Toolkit
LLM4HWDesign Starting Toolkit
☆19 · Oct 4, 2024 · Updated last year
NVIDIA / cuEmbed
CUDA Embedding Lookup Kernel Library
☆48 · Jun 26, 2026 · Updated 3 weeks ago
camhahu / bybit-trading-bot
A trading bot for Bybit utilising APScheduler
☆16 · Apr 16, 2020 · Updated 6 years ago
mengyangniu / ogbn-papers100m-sage
☆14 · Mar 1, 2021 · Updated 5 years ago
alugowski / fast_matrix_market
Fast and full-featured Matrix Market I/O library for C++, Python, and R
☆91 · Aug 5, 2024 · Updated last year
lukedodd / JitCalc
Mathematical expression evaluator with just-in-time code generation.
☆12 · Apr 7, 2013 · Updated 13 years ago
rusty1s / dotfiles
☆27 · Jul 4, 2026 · Updated 2 weeks ago
soedinglab / b-lore
Bayesian multiple logistic regression for GWAS meta-analysis
☆17 · Aug 20, 2025 · Updated 10 months ago
YusukeNagasaka / Batched-SpMM
New batched algorithm for sparse matrix-matrix multiplication (SpMM)
☆16 · May 7, 2019 · Updated 7 years ago
JiangLiSJTU / token-ring
☆13 · Jan 7, 2025 · Updated last year
NeuraChip / neurachip
NeuraChip Accelerator Simulator
☆16 · Apr 26, 2024 · Updated 2 years ago
SuperLiaoXH / SystolicArray-2D-FP16
FP16-based 2D systolic array circuit design
☆13 · Feb 23, 2023 · Updated 3 years ago
n-getty / argo-shim
☆16 · Jul 10, 2026 · Updated last week
OpenMOSE / RWKV-Infer
A large-scale RWKV v7 (World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states (Pseudo MoE). Easy to deploy…
☆51 · Oct 21, 2025 · Updated 8 months ago
ldbc / ldbc_finbench_docs
The specification of the LDBC Financial Benchmark
☆19 · Jan 9, 2026 · Updated 6 months ago
PASSIONLab / CombBLAS
The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear …
☆82 · Jun 4, 2026 · Updated last month
grlee77 / python-cuda-cffi
Experimental Python CFFI interface to NVIDIA's cuSOLVER and cuSPARSE libraries.
☆13 · Jul 16, 2020 · Updated 6 years ago
pskugit / custom-conv2d
A study for a custom convolution layer in which the x and y components of an image pixel are added to the kernel inputs.
☆12 · Feb 21, 2020 · Updated 6 years ago
kyaso / py-v
A cycle-accurate RISC-V CPU simulator + RTL modeling library in pure Python.
☆18 · Aug 27, 2025 · Updated 10 months ago