facebookresearch / NasRecLinks

NASRec Weight Sharing Neural Architecture Search for Recommender Systems

☆30

Alternatives and similar repositories for NasRec

Users that are interested in NasRec are comparing it to the libraries listed below

Sorting:

TencentARC / BEBR
Official code for "Binary embedding based retrieval at Tencent"
☆43Updated last year
MILVLG / mlc-imp
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
☆10Updated last year
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆98Updated 10 months ago
facebookresearch / adaptive_scheduling
Experimental scripts for researching data adaptive learning rate scheduling.
☆23Updated last year
facebookresearch / active_indexing
Official implementation of "Active Image Indexing"
☆59Updated 2 years ago
lucidrains / light-recurrent-unit-pytorch
Implementation of a Light Recurrent Unit in Pytorch
☆48Updated 10 months ago
lucidrains / kalman-filtering-attention
Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"
☆58Updated last year
NVlabs / EfficientDL
☆33Updated last month
onnx / neural-compressor
Model compression for ONNX
☆97Updated 8 months ago
lucidrains / mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
☆120Updated 9 months ago
alenic / timm-models-explorer
Timm model explorer
☆41Updated last year
microsoft / ResiDual
ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802
☆95Updated last year
lucidrains / self-reasoning-tokens-pytorch
Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto
☆56Updated last year
huggingface / pixparse
Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data
☆21Updated last year
kyegomez / Blockwise-Parallel-Transformer
32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers.
☆48Updated 2 years ago
tridao / flash-attention-wheels
☆52Updated last year
lucidrains / sinkhorn-router-pytorch
Self contained pytorch implementation of a sinkhorn based router, for mixture of experts or otherwise
☆37Updated 11 months ago
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆56Updated last week
crypdick / timm-lr-scheduler-explorer
A dashboard for exploring timm learning rate schedulers
☆19Updated 8 months ago
LiqunMa / FBI-LLM
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
☆50Updated last year
IST-DASLab / QIGen
Repository for CPU Kernel Generation for LLM Inference
☆26Updated 2 years ago
snu-mllab / LayerMerge
Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024)
☆30Updated 11 months ago
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆110Updated 7 months ago
lucidrains / deep-cross-attention
Implementation of the proposed DeepCrossAttention by Heddes et al at Google research, in Pytorch
☆90Updated 5 months ago
lucidrains / pytorch-custom-utils
Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new…
☆124Updated last year
eth-easl / fmengine
Utilities for Training Very Large Models
☆58Updated 10 months ago
lucidrains / simplicial-attention
Implementation of 2-simplicial attention proposed by Clift et al. (2019) and the recent attempt to make practical in Fast and Simplex, Ro…
☆42Updated 3 weeks ago
lucidrains / transformer-lm-gan
Explorations into adversarial losses on top of autoregressive loss for language modeling
☆37Updated 5 months ago
facebookresearch / coocmap
code for paper "Accessing higher dimensions for unsupervised word translation"
☆21Updated 2 years ago
frankxwang / dpo-prefix-sharing
DPO, but faster 🚀
☆44Updated 8 months ago