victorchen96 / Draw-Paper-Plot-Using-Seaborn
Some examples of drawing illustrative plots for papers using the seaborn package
☆13 Updated 4 years ago
Related projects:
- Mixture of Attention Heads ☆36 Updated last year
- The repository for our paper: Neighboring Perturbations of Knowledge Editing on Large Language Models ☆13 Updated 4 months ago
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023) ☆19 Updated last year
- How Does Selective Mechanism Improve Self-attention Networks? ☆27 Updated 3 years ago
- Code implementation for paper "On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals". ☆16 Updated 2 years ago
- [ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation”(https://arxiv.org/abs/2305.… ☆38 Updated last year
- Weighted Training for Cross-Task Learning ☆15 Updated last year
- code for Explicit Sparse Transformer ☆57 Updated last year
- ☆17 Updated last year
- ☆32 Updated 3 years ago
- Codes for Merging Large Language Models ☆16 Updated last month
- Implementation for Variational Information Bottleneck for Effective Low-resource Fine-tuning, ICLR 2021 ☆36 Updated 3 years ago
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models. ☆21 Updated 2 years ago
- Crawl & visualize ICLR papers and reviews. ☆18 Updated last year
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting ☆11 Updated last month
- Code for the PAPA paper ☆27 Updated last year
- [ACL 2023] Delving into the Openness of CLIP ☆22 Updated last year
- ☆27 Updated last year
- This is the project for IRM methods ☆11 Updated 3 years ago
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding” ☆29 Updated last year
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" ☆34 Updated 2 months ago
- Code for the AAAI 2022 publication "Well-classified Examples are Underestimated in Classification with Deep Neural Networks" ☆40 Updated 2 years ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆28 Updated 4 months ago
- [NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”, Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangya… ☆27 Updated 2 years ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆42 Updated last year
- ☆29 Updated 2 years ago
- ☆11 Updated 6 months ago
- Code for the ACL-2022 paper "StableMoE: Stable Routing Strategy for Mixture of Experts" ☆41 Updated 2 years ago
- On the Effectiveness of Parameter-Efficient Fine-Tuning ☆38 Updated 10 months ago
- Learning to Encode Position for Transformer with Continuous Dynamical Model ☆59 Updated 4 years ago