trestad / Factual-Recall-Mechanism
The code for the paper "Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models".
☆13Updated 5 months ago
Related projects:
- Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse'☆11Updated last month
- ☆42Updated 5 months ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆20Updated 2 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆28Updated 8 months ago
- ☆24Updated 3 months ago
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA☆21Updated 5 months ago
- "Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw…☆19Updated 4 months ago
- Multi-modal code generation problems.☆15Updated 2 weeks ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]☆27Updated 4 months ago
- A list of diffusion papers in the NLP domain that I have read, mainly on text generation; the table will continue to be updated.☆23Updated last month
- Lightweight Adapting for Black-Box Large Language Models☆16Updated 7 months ago
- Code for Findings of EMNLP2023 paper "Coarse-to-Fine Dual Encoders are Better Frame Identification Learners"☆12Updated 11 months ago
- Repository for our paper "DeepEdit: Knowledge Editing as Decoding with Constraints". https://arxiv.org/abs/2401.10471☆15Updated 3 months ago
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"☆63Updated 3 months ago
- Implementation of the MATRIX framework (ICML 2024)☆36Updated 4 months ago
- PyTorch implementation of StableMask (ICML'24)☆11Updated 2 months ago
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".☆43Updated last week
- ☆40Updated 5 months ago
- ☆17Updated 3 months ago
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…☆30Updated 2 months ago
- ☆10Updated 3 weeks ago
- 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆52Updated 3 weeks ago
- A benchmark for evaluating logical reasoning of LLMs☆16Updated 7 months ago
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning”☆15Updated 6 months ago
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆59Updated 7 months ago
- [ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio…☆16Updated 3 months ago
- ☆32Updated 10 months ago
- Official implementation for the paper *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*☆57Updated 3 weeks ago
- A method of ensemble learning for heterogeneous large language models.☆25Updated last month
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆32Updated 10 months ago