archinetai / vat-pytorch

Virtual Adversarial Training (VAT) techniques in PyTorch

☆17

Alternatives and similar repositories for vat-pytorch

Users who are interested in vat-pytorch are comparing it to the libraries listed below.

mlpc-ucsd / BERT_Convolutions
(ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models.
☆21 Updated 3 years ago
CyndxAI / QKNorm
Code for the paper "Query-Key Normalization for Transformers"
☆49 Updated 4 years ago
RobertCsordas / linear_layer_as_attention
The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …
☆16 Updated 5 months ago
kyegomez / Reka-Torch
Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch
☆29 Updated this week
maszhongming / ParaKnowTransfer
Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"
☆32 Updated last year
RiTUAL-MBZUAI / DA_NER
“Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition” (EMNLP 2022)
☆16 Updated 2 years ago
lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆50 Updated 3 years ago
wyu-du / GP-VAE
This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Pr…
☆26 Updated 3 years ago
yangjackie / Topics-on-diffusion-generative-models
☆27 Updated last month
MGheini / xattn-transfer-for-mt
Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Tra…
☆32 Updated 4 years ago
lucidrains / coco-lm-pytorch
Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch
☆46 Updated 4 years ago
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18 Updated last year
acmi-lab / pretraining-with-nonsense
Pretraining summarization models using a corpus of nonsense
☆13 Updated 4 years ago
antofuller / configaformers
A python library for highly configurable transformers - easing model architecture search and experimentation.
☆49 Updated 3 years ago
renll / SparseLT
[EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing
☆14 Updated 2 years ago
lucidrains / tableformer-pytorch
Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch
☆39 Updated 3 years ago
facebookresearch / dmae_st
Directed masked autoencoders
☆14 Updated 2 years ago
ShiZhengyan / PowerfulPromptFT
[NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Lea…
☆76 Updated last year
allenai / better-promptability
☆11 Updated 2 years ago
lucidrains / rela-transformer
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
☆49 Updated 3 years ago
jenni-ai / T2FW
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆19 Updated 3 years ago
pkuzengqi / Skyformer
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)
☆63 Updated 3 years ago
yaohungt / TransformerDissection
[EMNLP'19] Summary for Transformer Understanding
☆53 Updated 5 years ago
yikangshen / megablocks
☆20 Updated last year
lucidrains / memory-transformer-xl
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
☆49 Updated 5 years ago
microsoft / AMOS
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
☆25 Updated 2 years ago
lzy1732008 / GaussionTransformer
Code for the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference"
☆28 Updated 5 years ago
ThomasScialom / T0_continual_learning
Adding new tasks to T0 without catastrophic forgetting
☆33 Updated 3 years ago
bojone / univae
A single-model, multi-scale VAE based on the Transformer
☆57 Updated 4 years ago
CAMTL / CA-MTL
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data
☆57 Updated 4 years ago