No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)
☆29 Feb 9, 2022 Updated 4 years ago
Alternatives and similar repositories for SAGE
Users who are interested in SAGE are comparing it to the libraries listed below
- This is the repository for our EMNLP 2022 paper "The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains". ☆10 Jun 2, 2023 Updated 2 years ago
- Code and dataset for Polyglot Prompting: Multilingual Multitask Prompt Training. ☆18 Dec 7, 2022 Updated 3 years ago
- ☆14 Oct 11, 2023 Updated 2 years ago
- Code for COLING 2020 paper "Improving Document-level Sentiment Analysis with User and Product Context" ☆11 Apr 13, 2022 Updated 3 years ago
- some mixture of experts architecture implementations ☆26 Mar 22, 2024 Updated last year
- This pytorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022). ☆46 Oct 17, 2022 Updated 3 years ago
- This repository contains some of the code used in the paper "Training Language Models with Language Feedback at Scale" ☆27 Mar 30, 2023 Updated 2 years ago
- A Wasserstein Subsequence Kernel for Time Series. ☆21 Jun 17, 2024 Updated last year
- ☆24 Jun 12, 2023 Updated 2 years ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆26 Dec 23, 2024 Updated last year
- A repo listing known open source voice tools, ordered by where they sit in the voice stack ☆27 Sep 23, 2022 Updated 3 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆30 Mar 5, 2024 Updated last year
- Formal representation and solving for Euclidean plane geometry problems. ☆33 Dec 19, 2025 Updated 2 months ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Mar 22, 2024 Updated last year
- Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding ☆28 Jan 28, 2024 Updated 2 years ago
- An environment for learning formal mathematical reasoning from scratch ☆72 Aug 18, 2024 Updated last year
- This repository provides the data and the codes used in the AAAI'24 paper, COOPER: Coordinating Specialized Agents towards a Complex Dial… ☆27 Mar 1, 2024 Updated 2 years ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆28 Feb 9, 2026 Updated 3 weeks ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings. ☆31 Dec 6, 2023 Updated 2 years ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆66 Jun 29, 2023 Updated 2 years ago
- Search comments and highlights annotations in PDF documents. ☆12 May 4, 2023 Updated 2 years ago
- The official repo of continuous speculative decoding ☆31 Mar 28, 2025 Updated 11 months ago
- Codebase for fine-tuning Llama2 70B to generate math test questions and answers. ☆11 Aug 30, 2024 Updated last year
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach ☆32 Nov 6, 2023 Updated 2 years ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆31 Jul 9, 2024 Updated last year
- ☆213 Oct 10, 2022 Updated 3 years ago
- ☆30 Dec 27, 2024 Updated last year
- ☆84 May 10, 2024 Updated last year
- ☆35 Jan 10, 2025 Updated last year
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling ☆34 Nov 21, 2021 Updated 4 years ago
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models". ☆12 Oct 14, 2024 Updated last year
- Concurrency library ☆17 Oct 13, 2024 Updated last year
- Automatically take good care of your preemptible TPUs ☆37 May 15, 2023 Updated 2 years ago
- Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention". ☆44 Oct 29, 2021 Updated 4 years ago
- MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer ☆41 Jun 7, 2022 Updated 3 years ago
- Transformers at any scale ☆42 Jan 18, 2024 Updated 2 years ago
- A JAX library for building lattice-based speech transducer models ☆46 Updated this week
- The official repository for the paper Multilingual Mathematical Autoformalization ☆38 May 20, 2024 Updated last year
- Embedding Recycling for Language models ☆38 Jul 11, 2023 Updated 2 years ago