PKU-ML / LongPPL
☆17 · Updated this week
Related projects
Alternatives and complementary repositories for LongPPL
- Long Context Extension and Generalization in LLMs ☆39 · Updated 2 months ago
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024] ☆15 · Updated 6 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆15 · Updated 5 months ago
- ☆36 · Updated 3 months ago
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ☆29 · Updated last month
- Source code for the TMLR paper "Black-Box Prompt Learning for Pre-trained Language Models" ☆55 · Updated last year
- Codebase for decoding compressed trust. ☆20 · Updated 6 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆75 · Updated last month
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆32 · Updated 2 weeks ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆28 · Updated 3 weeks ago
- [NeurIPS 2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆69 · Updated last month
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" ☆28 · Updated 4 months ago
- Homepage for ProLong (Princeton long-context language models) and the paper "How to Train Long-Context Language Models (Effectively)" ☆118 · Updated 3 weeks ago
- ☆20 · Updated 4 months ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆69 · Updated last year
- Is In-Context Learning Sufficient for Instruction Following in LLMs? ☆25 · Updated 5 months ago
- ☆43 · Updated 9 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆84 · Updated 5 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆24 · Updated 3 weeks ago
- Test-time training on nearest neighbors for large language models ☆27 · Updated 7 months ago
- ☆33 · Updated last year
- Official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting ☆14 · Updated 3 months ago
- Codebase for Instruction Following without Instruction Tuning ☆31 · Updated last month
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆44 · Updated last year
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆30 · Updated 3 months ago
- Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long) ☆53 · Updated last month
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆63 · Updated last year
- Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023). ☆24 · Updated 2 months ago
- Methods and evaluation for aligning language models temporally ☆24 · Updated 8 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆35 · Updated 7 months ago