arshadshk / Position-Prediction-Pretraining
Position Prediction as an Effective Pretraining Strategy
☆8 Updated 2 years ago
Alternatives and similar repositories for Position-Prediction-Pretraining
Users who are interested in Position-Prediction-Pretraining are comparing it to the libraries listed below.
- ☆14 Updated 3 years ago
- EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets ☆9 Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated last month
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆39 Updated 3 years ago
- ☆51 Updated last year
- Model Stock: All we need is just a few fine-tuned models ☆119 Updated 10 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆44 Updated last year
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆29 Updated 10 months ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts" ☆19 Updated 2 years ago
- [Findings of ACL-2023] This is the official implementation of On the Difference of BERT-style and CLIP-style Text Encoders. ☆14 Updated 2 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 Updated 2 years ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024). ☆43 Updated last year
- Structured Pruning Adapters in PyTorch ☆18 Updated last year
- Official code for the paper "Attention as a Hypernetwork" ☆40 Updated last year
- ☆13 Updated 3 years ago
- Code for T-MARS data filtering ☆35 Updated last year
- ☆20 Updated last year
- Implementation of Discrete Key / Value Bottleneck, in Pytorch ☆88 Updated 2 years ago
- Codebase for the SIMAT dataset and evaluation ☆38 Updated 3 years ago
- Official repository for Fourier model that can generate periodic signals ☆10 Updated 3 years ago
- ☆16 Updated 2 years ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆31 Updated 2 years ago
- Adding new tasks to T0 without catastrophic forgetting ☆33 Updated 2 years ago
- Official PyTorch implementation of "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data" (NeurIPS'23) ☆15 Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆31 Updated 2 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch ☆119 Updated 4 years ago
- [COLM 2024] Early Weight Averaging meets High Learning Rates for LLM Pre-training ☆17 Updated 9 months ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings ☆16 Updated 2 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning ☆11 Updated 7 months ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆30 Updated 9 months ago