facebookresearch / FBTT-Embedding

This is a Tensor Train-based compression library for sparse embedding tables used in large-scale machine learning models, such as recommendation and natural language processing models. We showed that this library can reduce the total model size by up to 100x in Facebook’s open-sourced DLRM model while achieving the same model quality. Our implementation …

☆194

Alternatives and similar repositories for FBTT-Embedding

Users who are interested in FBTT-Embedding are comparing it to the libraries listed below.

spcl / substation
Research and development for optimizing transformers
☆129 · Updated 4 years ago
petuum / autodist
Simple Distributed Deep Learning on TensorFlow
☆133 · Updated last month
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆259 · Updated 2 years ago
parasj / checkmate
Training neural networks in TensorFlow 2.0 with 5x less memory
☆132 · Updated 3 years ago
mlcommons / training_results_v0.7
This repository contains the results and code for the MLPerf™ Training v0.7 benchmark.
☆57 · Updated 2 years ago
pytorch / rfcs
PyTorch RFCs (experimental)
☆133 · Updated last month
TezRomacH / layer-to-layer-pytorch
PyTorch implementation of L2L execution algorithm
☆107 · Updated 2 years ago
PersiaML / PERSIA
High performance distributed framework for training deep learning recommendation models based on PyTorch.
☆409 · Updated last month
kaiyuyue / torchshard
Slicing a PyTorch Tensor Into Parallel Shards
☆299 · Updated last month
ptillet / torch-blocksparse
Block-sparse primitives for PyTorch
☆157 · Updated 4 years ago
marsupialtail / sparsednn
Fast sparse deep learning on CPUs
☆53 · Updated 2 years ago
microsoft / varuna
☆251 · Updated 11 months ago
octoml / synr
A library for syntactically rewriting Python programs, pronounced (sinner).
☆69 · Updated 3 years ago
harvard-acc / DeepRecSys
http://vlsiarch.eecs.harvard.edu/research/recommendation/
☆136 · Updated 2 years ago
VoVAllen / tf-dlpack
DLPack for TensorFlow
☆35 · Updated 5 years ago
graphcore / tutorials
Training material for IPU users: tutorials, feature examples, simple applications
☆86 · Updated 2 years ago
NVIDIA-Merlin / distributed-embeddings
distributed-embeddings is a library for building large embedding-based models in TensorFlow 2.
☆44 · Updated last year
NVIDIA / PyProf
A GPU performance profiling tool for PyTorch models
☆503 · Updated 4 years ago
snuspl / parallax
A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments.
☆132 · Updated 3 years ago
NVIDIA / LDDL
Distributed preprocessing and data loading for language datasets
☆39 · Updated last year
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆270 · Updated 3 years ago
graphcore / poptorch
PyTorch interface for the IPU
☆180 · Updated last year
ezyang / stride-visualizer
Stride visualizations
☆37 · Updated 7 years ago
huggingface / pytorch_block_sparse
Fast Block Sparse Matrices for PyTorch
☆548 · Updated 4 years ago
bytedance / effective_transformer
Running BERT without Padding
☆472 · Updated 3 years ago
YannDubs / Hash-Embeddings
PyTorch implementation of Hash Embeddings (NIPS 2017). Submission to the NIPS Implementation Challenge.
☆199 · Updated 6 years ago
pytorch / torchdistx
Torch Distributed Experimental
☆116 · Updated 11 months ago
triton-inference-server / hugectr_backend
☆55 · Updated last year
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆41 · Updated 2 years ago
utsaslab / MONeT
MONeT framework for reducing memory consumption of DNN training
☆173 · Updated 4 years ago