daochenzha / dreamshard
[NeurIPS 2022] DreamShard: Generalizable Embedding Table Placement for Recommender Systems
☆28Updated last year
Related projects:
- [MLSys 2023] Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models☆16Updated last year
- [KDD 2022] AutoShard: Automated Embedding Table Sharding for Recommender Systems☆21Updated last year
- [NeurIPS '22] Data distillation for recommender systems. Shows equivalent performance with 2-3 orders of magnitude less data.☆22Updated last year
- ☆16Updated 2 years ago
- ☆22Updated 10 months ago
- AutoLossGen: Automatic Loss Function Generation for Recommender Systems☆22Updated 2 years ago
- PyTorch open-source library for the paper "AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations"☆40Updated last month
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆37Updated 3 months ago
- AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers☆40Updated last year
- ☆23Updated 5 months ago
- Official codebase for NeurIPS 2022 paper End-to-end Learning to Index and Search in Large Output Spaces☆11Updated last year
- Linear Attention Sequence Parallelism (LASP)☆64Updated 3 months ago
- Repository for "GIST: Distributed training for large-scale graph convolutional networks"☆14Updated last year
- ☆13Updated this week
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients☆24Updated last week
- Official code for the ICML 2022 paper -- GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks☆49Updated last year
- Official repo of the paper "Eliminating Position Bias of Language Models: A Mechanistic Approach"☆10Updated 3 weeks ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)☆77Updated last year
- Can GPT-4 Perform Neural Architecture Search?☆82Updated last year
- Using FlexAttention to compute attention with different masking patterns☆28Updated last week
- ☆19Updated 2 years ago
- ☆13Updated 2 years ago
- Code for paper - On Diversified Preferences of Large Language Model Alignment☆14Updated last month
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"☆28Updated 4 months ago
- ICLR 2021☆42Updated 3 years ago
- Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances☆12Updated 2 years ago
- Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting"☆49Updated 2 months ago
- The official implementation of "Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization"☆13Updated 6 months ago
- Hyperparameter tuning via uncertainty modeling☆46Updated 4 months ago
- Code for Neural Execution Engines: Learning to Execute Subroutines☆16Updated 3 years ago