arshadshk / Position-Prediction-Pretraining
Position Prediction as an Effective Pretraining Strategy
☆8 Updated 2 years ago
Alternatives and similar repositories for Position-Prediction-Pretraining
Users who are interested in Position-Prediction-Pretraining are comparing it to the repositories listed below
- ☆19 Updated 10 months ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated last year
- ☆14 Updated 3 years ago
- Official code for the paper "Attention as a Hypernetwork" ☆33 Updated 10 months ago
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆37 Updated 3 years ago
- ☆31 Updated 6 months ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆29 Updated 8 months ago
- Official repo of the paper "Eliminating Position Bias of Language Models: A Mechanistic Approach" ☆14 Updated 8 months ago
- ☆23 Updated 7 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 Updated 3 weeks ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆64 Updated last year
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts" ☆20 Updated last year
- Code for paper: "LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits" ☆13 Updated 7 months ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024). ☆39 Updated 9 months ago
- ☆21 Updated 2 years ago
- HGRN2: Gated Linear RNNs with State Expansion ☆54 Updated 8 months ago
- Official repository for Fourier model that can generate periodic signals ☆10 Updated 3 years ago
- ☆13 Updated 2 years ago
- Active Learning Helps Pretrained Models Learn the Intended Task (https://arxiv.org/abs/2204.08491) by Alex Tamkin, Dat Nguyen, Salil Desh… ☆11 Updated 2 years ago
- Structured Pruning Adapters in PyTorch ☆17 Updated last year
- [Findings of ACL-2023] This is the official implementation of On the Difference of BERT-style and CLIP-style Text Encoders. ☆14 Updated last year
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆20 Updated last year
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning" ☆9 Updated 4 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆44 Updated last year
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 Updated 8 months ago
- ☆16 Updated 9 months ago
- ☆20 Updated 2 years ago
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆33 Updated 2 years ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? ☆13 Updated 3 months ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆30 Updated last year