A PyTorch implementation of Adafactor (https://arxiv.org/pdf/1804.04235.pdf)
☆26Aug 27, 2019Updated 6 years ago
Alternatives and similar repositories for adafactor-pytorch
Users that are interested in adafactor-pytorch are comparing it to the libraries listed below.
- Code for Episodic Memory Reader (EMR) https://arxiv.org/abs/1903.06164☆15Nov 16, 2022Updated 3 years ago
- Official Code Repository for the paper "Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation …☆20Jun 19, 2023Updated 2 years ago
- Sancho McCann's PhD Thesis Research Code☆25Oct 12, 2017Updated 8 years ago
- Utilities for TensorFlow 2.x.x☆15Jul 19, 2023Updated 2 years ago
- ☆13May 4, 2026Updated 3 weeks ago
- Code implementation for paper "On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals".☆17Dec 15, 2021Updated 4 years ago
- ☆23Oct 20, 2020Updated 5 years ago
- Transformers for Cost-Sensitive BERT for Generalisable Sentence Classification on Imbalanced Data☆18May 28, 2020Updated 5 years ago
- MultiLabel classification of cow diseases by text and symptoms recognition (NER)☆12Aug 13, 2022Updated 3 years ago
- A tool to help adjust or zero-out Flux Block Weights and SAVE. I'm not a dev, so this implementation might be wrong.☆29Nov 20, 2024Updated last year
- [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models☆39Nov 4, 2025Updated 6 months ago
- Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization (IEEE TPAMI 2021)☆17Jun 4, 2021Updated 4 years ago
- Stereoscopic 3D toolkit for ComfyUI combining depth-based stereo generation with GPU acceleration, native VR viewing via PyOpenXR, and AI…☆47May 20, 2026Updated last week
- Variance Covariance Regularization☆14Jun 22, 2023Updated 2 years ago
- Stable Diffusion PNGINFO Beautify extension☆31Oct 9, 2025Updated 7 months ago
- A PyTorch implementation of BERT for seq2seq tasks, using the UniLM approach.☆10Apr 1, 2020Updated 6 years ago
- The Stream-51 dataset for streaming classification and novelty detection from videos.☆17Feb 22, 2022Updated 4 years ago
- A Continual Learning Library in PyTorch and JAX☆13Apr 18, 2023Updated 3 years ago
- ☆18May 5, 2023Updated 3 years ago
- (ICML 2023) Feature learning in deep classifiers through Intermediate Neural Collapse: Accompanying code☆16Jul 27, 2023Updated 2 years ago
- Code for "Self-Distillation as Instance-Specific Label Smoothing"☆15Oct 22, 2020Updated 5 years ago
- I modified some of the K-BERT code so that it works with English datasets☆11Dec 15, 2022Updated 3 years ago
- PyTorch code for our CoLLAs-2022 paper "Online Continual Learning for Embedded Devices"☆12Aug 4, 2022Updated 3 years ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…☆15Nov 11, 2024Updated last year
- Extend the Conditioning of Stable Diffusion to take Audio Embeddings Instead of Text Embeddings using Wav2Vec2-BERT model☆13Sep 25, 2024Updated last year
- ☆10Oct 8, 2018Updated 7 years ago
- A simple modification on the official DETR codebase with support to Finetune on custom dataset☆14Nov 26, 2020Updated 5 years ago
- CoreXY conversion for the Folgertech FT-5 printer☆15Feb 20, 2024Updated 2 years ago
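- Chinese marketing text generation, implemented with a Pointer Generator Network and a coverage mechanism; also tries out various text data augmentation and optimization techniques☆18Sep 5, 2020Updated 5 years ago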
- LGEB: Benchmark of Language Generation Evaluation☆16Oct 21, 2022Updated 3 years ago
- A place to house minutes and other documents related to the core team.☆13Dec 16, 2020Updated 5 years ago
- ☆11Oct 5, 2020Updated 5 years ago
- [NeurIPS 2024] Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling☆27Jul 10, 2025Updated 10 months ago
- ☆34Apr 22, 2025Updated last year
- ☆10Sep 16, 2020Updated 5 years ago
- ☆17Jan 19, 2026Updated 4 months ago
- An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf☆11Jul 25, 2023Updated 2 years ago
- PyTorch helper code☆10Dec 20, 2018Updated 7 years ago
- This repository contains example code to build models on TPUs☆30Feb 17, 2023Updated 3 years ago