rivercold / ACClip-Pytorch

PyTorch implementation of the ICLR 2020 submission "Why ADAM Beats SGD for Attention Models" (https://openreview.net/pdf?id=SJx37TEtDH)

☆8

Alternatives and similar repositories for ACClip-Pytorch

Users who are interested in ACClip-Pytorch are comparing it to the repositories listed below.

yoonkim / neural-qcfg
☆45 · Updated 3 years ago
kernelmachine / demix
DEMix Layers for Modular Language Modeling
☆53 · Updated 3 years ago
archiki / GrIPS
Code for our paper: "GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models"
☆55 · Updated 2 years ago
lifu-tu / ENGINE
ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
☆25 · Updated 4 years ago
violet-zct / swarm-distillation-zero-shot
☆22 · Updated 2 years ago
nng555 / ssmba
☆62 · Updated 3 years ago
zomux / lanmt-ebm
lanmt ebm
☆12 · Updated 5 years ago
bigscience-workshop / architecture-objective
☆97 · Updated last year
thunlp / DPT
☆13 · Updated 3 years ago
RobertCsordas / ndr
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".
☆33 · Updated last month
jacobandreas / geca
☆42 · Updated 4 years ago
fangleai / Implicit-LVM
This code repository presents the pytorch implementation of the paper “Implicit Deep Latent Variable Models for Text Generation”(EMNLP 20…
☆55 · Updated 3 years ago
MikeWangWZHL / Zemi
Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings
☆16 · Updated 2 years ago
RobertCsordas / transformer_generalization
The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s…
☆67 · Updated 2 years ago
hsajjad / Interpretability-Tutorial-NAACL2021
☆24 · Updated 4 years ago
leo-liuzy / probe-across-time
☆22 · Updated 3 years ago
JunShern / few-shot-adaptation
Exploring Few-Shot Adaptation of Language Models with Tables
☆24 · Updated 2 years ago
salesforce / fast-influence-functions
☆89 · Updated 2 months ago
jungokasai / deep-shallow
☆44 · Updated 4 years ago
qkaren / COLD_decoding
☆108 · Updated 3 years ago
McGill-NLP / polytropon
☆54 · Updated 2 years ago
INK-USC / NExT
Source Code for paper "Learning from Explanations with Neural Execution Tree", ICLR 2020
☆18 · Updated 4 years ago
lena-voita / description-length-probing
This is a repository with the code for the EMNLP 2020 paper "Information-Theoretic Probing with Minimum Description Length"
☆71 · Updated 10 months ago
yzpang / gold-off-policy-text-gen-iclr21
☆50 · Updated 3 years ago
rosewang2008 / language_modeling_via_stochastic_processes
Language modeling via stochastic processes. Oral @ ICLR 2022.
☆138 · Updated 2 years ago
ekinakyurek / influence
Code for "Tracing Knowledge in Language Models Back to the Training Data"
☆38 · Updated 2 years ago
suzgunmirac / crowd-sampling
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding
☆18 · Updated 2 years ago
naver / gdc
Code accompanying our papers on the "Generative Distributional Control" framework
☆118 · Updated 2 years ago
AI-secure / InfoBERT
[ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Y…
☆85 · Updated last year
McGill-NLP / latent-translation
Code for the paper "Modelling Latent Translations for Cross-Lingual Transfer"
☆17 · Updated 3 years ago