BYU-PCCL / prompt-compression-contrastive-codingLinks

Companion repository to "Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models"

☆14

Alternatives and similar repositories for prompt-compression-contrastive-coding

Users that are interested in prompt-compression-contrastive-coding are comparing it to the libraries listed below

Sorting:

lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆50Updated 3 years ago
yikangshen / megablocks
☆20Updated last year
RobertCsordas / linear_layer_as_attention
The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …
☆16Updated 4 months ago
CyndxAI / QKNorm
Code for the paper "Query-Key Normalization for Transformers"
☆49Updated 4 years ago
lucidrains / esbn-transformer
An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols
☆16Updated 4 years ago
dangxingyu / rnn-icrag
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27Updated last year
MikeWangWZHL / Zemi
Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings
☆16Updated 2 years ago
codekansas / rwkv
RWKV model implementation
☆38Updated 2 years ago
sustcsonglin / gated_linear_attention_layer
☆31Updated last year
ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆59Updated last year
renll / SeqBoat
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling
☆39Updated last year
jenni-ai / T2FW
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆19Updated 3 years ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32Updated last year
jxiw / BiGS
Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE …
☆114Updated last year
abhishekpanigrahi1996 / transformer_in_transformer
☆45Updated 2 years ago
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15Updated last year
ThomasScialom / T0_continual_learning
Adding new tasks to T0 without catastrophic forgetting
☆33Updated 2 years ago
ermongroup / fast_feedforward_computation
Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021
☆27Updated 4 years ago
ethz-spylab / superhuman-ai-consistency
☆30Updated 2 years ago
ctlllll / reward_collapse
☆27Updated 2 years ago
prateeky2806 / ComPEFT
☆26Updated last year
ekinakyurek / google-research
Google Research
☆46Updated 2 years ago
HazyResearch / embroid
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
☆11Updated 2 years ago
ahennequ / pytorch-custom-mma
☆29Updated 3 years ago
shreyansh26 / Attention-Mask-Patterns
Using FlexAttention to compute attention with different masking patterns
☆46Updated last year
belindal / LaMPP
Code for LaMPP: Language Models as Probabilistic Priors for Perception and Action
☆37Updated 2 years ago
microsoft / EfficientLongSequenceModeling
☆51Updated 2 years ago
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆21Updated last year
srush / mamba-scans
Blog post
☆17Updated last year
ylsung / vl-merging
PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging"
☆37Updated 2 years ago