BAI-Yeqi / Statistical-Properties-of-Dot-ProductLinks

☆16

Alternatives and similar repositories for Statistical-Properties-of-Dot-Product

Users that are interested in Statistical-Properties-of-Dot-Product are comparing it to the libraries listed below

Sorting:

HKUNLP / efficient-attention
[EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling
☆86Updated 2 years ago
ZhuiyiTechnology / GAU-alpha
基于Gated Attention Unit的Transformer模型（尝鲜版）
☆98Updated 2 years ago
microsoft / Stochastic-Mixture-of-Experts
This package implements THOR: Transformer with Stochastic Experts.
☆65Updated 4 years ago
bojone / tiger
A Tight-fisted Optimizer
☆50Updated 2 years ago
DRSY / EMO
[ICLR 2024]EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling(https://arxiv.org/abs/2310.04691)
☆126Updated last year
QingruZhang / PLATON
This pytorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).
☆46Updated 3 years ago
EdisonLeeeee / ICLR2022-OpenReviewData
ICLR 2022 Paper submission trend analysis from https://openreview.net/group?id=ICLR.cc/2022/Conference
☆85Updated 3 years ago
VITA-Group / Random-MoE-as-Dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…
☆55Updated 2 years ago
TsinghuaC3I / SoRA
[EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models
☆82Updated last year
lancopku / FedMNMT
[Findings of ACL 2023] Communication Efficient Federated Learning for Multilingual Machine Translation with Adapter
☆12Updated 2 years ago
thunlp / MoEfication
☆140Updated last year
hyp1231 / ICLR2023-OpenReviewData
Crawl & visualize ICLR papers and reviews.
☆18Updated 2 years ago
RunxinXu / ContrastivePruning
Source code for our AAAI'22 paper 《From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression》
☆25Updated 3 years ago
OpenNLPLab / cosFormer
[ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention
☆196Updated 2 years ago
thuml / LogME
Code release for "LogME: Practical Assessment of Pre-trained Models for Transfer Learning" (ICML 2021) and Ranking and Tuning Pre-trained…
☆209Updated 2 years ago
pkunlp-icler / ChildTuning
☆33Updated 4 years ago
SimiaoZuo / MoEBERT
This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).
☆112Updated 3 years ago
bojone / LST-CLUE
Ladder Side-Tuning在CLUE上的简单尝试
☆22Updated 3 years ago
anshuman23 / InfDataSel
Code for paper: “What Data Benefits My Classifier?” Enhancing Model Performance and Interpretability through Influence-Based Data Selecti…
☆24Updated last year
zzp1012 / LLFC
[NeurIPS 2023] Code release for "Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity"
☆19Updated 2 years ago
victorchen96 / Draw-Paper-Plot-Using-Seaborn
some examples for drawing illustration plots for paper using seaborn package
☆15Updated 6 years ago
TobiasLee / Awesome-Efficient-PLM
Must-read papers on improving efficiency for pre-trained language models.
☆105Updated 2 years ago
lancopku / well-classified-examples-are-underestimated
Code for the AAAI 2022 publication "Well-classified Examples are Underestimated in Classification with Deep Neural Networks"
☆53Updated 3 years ago
hazdzz / tiger
A Tight-fisted Optimizer (Tiger), implemented in PyTorch.
☆12Updated last year
weigq / neurips2021_stats
☆35Updated 3 years ago
princeton-nlp / CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
☆197Updated 2 years ago
MaHuanAAA / g_fair_prompting
☆33Updated 2 years ago
floatingCatty / BAAI-Monthly-
Comprehensive and precise reviews of advance AI subfields.
☆56Updated 4 years ago
twinkle0331 / LGTM
[ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation”(https://arxiv.org/abs/2305.…
☆38Updated 2 years ago
transformer-vq / transformer_vq
☆196Updated last year