NVIDIA-Merlin / distributed-embeddings

distributed-embeddings is a library for building large embedding-based models in TensorFlow 2.

☆46

Alternatives and similar repositories for distributed-embeddings

Users who are interested in distributed-embeddings are comparing it to the libraries listed below.

NVIDIA-Merlin / HierarchicalKV
HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of…
☆172 Updated 2 weeks ago
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆273 Updated last month
intel / torch-ccl
oneCCL Bindings for Pytorch*
☆102 Updated 2 months ago
harvard-acc / DeepRecSys
http://vlsiarch.eecs.harvard.edu/research/recommendation/
☆135 Updated 3 years ago
NVIDIA / nvtx-plugins
Python bindings for NVTX
☆66 Updated 2 years ago
triton-inference-server / hugectr_backend
☆56 Updated last year
PersiaML / PERSIA
High performance distributed framework for training deep learning recommendation models based on PyTorch.
☆409 Updated 3 months ago
DeepRec-AI / HybridBackend
A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters
☆158 Updated last year
octoml / octoml-profile
Home for OctoML PyTorch Profiler
☆114 Updated 2 years ago
facebookresearch / param
PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…
☆152 Updated this week
rapidsai / wholegraph
WholeGraph - large scale Graph Neural Networks
☆104 Updated 10 months ago
tensorflow / networking
Enhanced networking support for TensorFlow. Maintained by SIG-networking.
☆99 Updated 3 years ago
linnanwang / superneurons-release
The release repository of superneurons
☆53 Updated 4 years ago
google / nccl-fastsocket
NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.
☆120 Updated last year
tensorflow / custom-op
Guide for building custom ops for TensorFlow
☆381 Updated 2 years ago
TalwalkarLab / paleo
An analytical performance modeling tool for deep neural networks.
☆91 Updated 5 years ago
Funatiq / gossip
gossip: Efficient Communication Primitives for Multi-GPU Systems
☆59 Updated 3 years ago
petuum / autodist
Simple Distributed Deep Learning on TensorFlow
☆134 Updated 3 months ago
anilshanbhag / gpu-topk
Efficient Top-K implementation on the GPU
☆188 Updated 6 years ago
ezyang / nvprof2json
Convert nvprof profiles into about:tracing compatible JSON files
☆70 Updated 4 years ago
NVIDIA-Merlin / HugeCTR
HugeCTR is a high-efficiency GPU framework designed for training Click-Through-Rate (CTR) estimation models
☆1,035 Updated 3 weeks ago
awslabs / raf
☆145 Updated 8 months ago
msr-fiddle / pipedream
☆393 Updated 2 years ago
tbd-ai / tbd-suite
☆47 Updated 2 years ago
facebookresearch / dlrm_datasets
Set of datasets for the deep learning recommendation model (DLRM).
☆47 Updated 2 years ago
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆144 Updated 2 months ago
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆118 Updated last year
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆357 Updated this week
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Models
☆76 Updated 4 years ago
jiazhihao / TASO
The Tensor Algebra SuperOptimizer for Deep Learning
☆730 Updated 2 years ago