LOG-postech / rethinking-LLM-pruning
☆23 Updated 4 months ago
Alternatives and similar repositories for rethinking-LLM-pruning:
Users who are interested in rethinking-LLM-pruning are comparing it to the libraries listed below.
- 🔨 Malet (Machine Learning Experiment Tool) is a tool for efficient machine learning experiment execution, logging, analysis, and plot ma… ☆17 Updated last month
- Code for reproducing the results from arXiv paper "Critical Influence of Overparameterization on Sharpness-aware Minimization" ☆14 Updated 7 months ago
- Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024) ☆10 Updated 4 months ago
- [EMNLP 2022] TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models ☆70 Updated 9 months ago
- [ICLR 2022] Towards Continual Knowledge Learning of Language Models ☆92 Updated 2 years ago
- Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23) ☆14 Updated last year
- ☆23 Updated last year
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆79 Updated 5 months ago
- ☆20 Updated last year
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp… ☆20 Updated 11 months ago
- ☆23 Updated last year
- Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long) ☆56 Updated 4 months ago
- ☆14 Updated 11 months ago
- ☆64 Updated 2 years ago
- KAIST AI605 Deep Learning for NLP ☆31 Updated 2 years ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆95 Updated last year
- ☆61 Updated last year
- CharFormer (Tay et al., 2022; Gradient-based Subword Tokenizer + T5) model implementation for Huggingface Transformers ☆21 Updated 4 months ago
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024] ☆17 Updated 9 months ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆56 Updated 3 weeks ago
- νκ΅μ΄ μμ± λ¬Έμμ μμ μ¬μ€ κ΄κ³μ λν μ€λͺ κΈ°μ ☆14 Updated 2 months ago
- ☆10 Updated 5 months ago
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks… ☆37 Updated 2 months ago
- [ACL 2021] Learning to Perturb Word Embeddings for Out-of-distribution QA ☆16 Updated 2 years ago
- demo page of krafton virtual Sherlock ☆7 Updated last year
- Official repository of "HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning", Findings of EMNLP 2023 ☆21 Updated last year
- Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors" ☆38 Updated 8 months ago
- [TACL 2024] Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis ☆10 Updated 3 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards ☆48 Updated 9 months ago
- ☆50 Updated last year