Masked Structural Growth for 2x Faster Language Model Pre-training
☆25Apr 28, 2024Updated 2 years ago
Alternatives and similar repositories for MSG
Users that are interested in MSG are comparing it to the libraries listed below.
- An Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales☆16Jun 6, 2024Updated 2 years ago
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 4 years ago
- Open Source Implementation of Dual Modality MAGVIT2 Tokenizer☆25Nov 26, 2024Updated last year
- FLM-Audio is an audio-language subversion of RoboEgo/FLM-Ego -- an omnimodal model with native full duplexity.☆69May 15, 2026Updated 3 weeks ago
- ☆11Feb 3, 2025Updated last year
- Source code for paper: Knowledge Inheritance for Pre-trained Language Models☆37Apr 24, 2022Updated 4 years ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data☆32Sep 22, 2024Updated last year
- [Poster; ICLR 2026] [Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers☆16Apr 15, 2026Updated last month
- Implementation of MixCE method described in ACL 2023 paper by Zhang et al.☆20May 29, 2023Updated 3 years ago
- Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code.☆22Nov 26, 2022Updated 3 years ago
- Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions☆15May 7, 2018Updated 8 years ago
- Scala interfaces to huggingface transformers and tokenizers☆13Apr 27, 2026Updated last month
- [ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennige…☆92Feb 26, 2024Updated 2 years ago
- This project contains code for the paper titled "SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentia…☆28Feb 21, 2024Updated 2 years ago
- Log-Polar Space Convolution for Convolutional Neural Networks☆13Dec 12, 2022Updated 3 years ago
- Explanation-centered inference for question answering☆16Feb 7, 2018Updated 8 years ago
- Fork of Flame repo for training of some new stuff in development☆19Jun 1, 2026Updated last week
- Source Data of ACL2021 paper "Syntax-Enhanced Pre-trained Model"☆11Jun 1, 2021Updated 5 years ago
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS2024 Oral)☆36Jan 18, 2025Updated last year
- [Neural Networks] SpikeBERT: A Language Spikformer Learned from BERT with Knowledge Distillation☆30Apr 11, 2025Updated last year
- Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent☆14Feb 6, 2020Updated 6 years ago
- Official PyTorch implementation of our CVPR 2025 paper, "LoRA Subtraction for Drift-Resistant Space in Exemplar-Free Continual Learning."☆18Mar 28, 2025Updated last year
- Block-Recurrent Dynamics in ViTs 🦖☆45May 21, 2026Updated 3 weeks ago
- ☆43Oct 13, 2023Updated 2 years ago
- ☆13Apr 16, 2018Updated 8 years ago
- NICE: Neurogenesis Inspired Contextual Encoding for Replay-free Class Incremental Learning☆29Jul 28, 2024Updated last year
- Generic build server☆65May 25, 2014Updated 12 years ago
- qup: a Single-Node Job Scheduler with NVIDIA GPU support☆18Jan 10, 2023Updated 3 years ago
- ☆36Aug 23, 2023Updated 2 years ago
- ☆17Feb 2, 2024Updated 2 years ago
- ☆14Nov 21, 2017Updated 8 years ago
- This is the unofficial implementation of LEMON (ICLR'2024).☆13Apr 14, 2024Updated 2 years ago
- ☆33Mar 13, 2024Updated 2 years ago
- Code for AdaXpert (ICML'21)☆16Jul 19, 2021Updated 4 years ago
- YoloV3 + OpenVINO + ROS☆11Feb 3, 2019Updated 7 years ago
- An experiment to see if chatgpt can improve the output of the stanford alpaca dataset☆12Mar 29, 2023Updated 3 years ago
- A sample app to debug and validate cellular modems on balena devices☆13Jun 5, 2019Updated 7 years ago
- ☆22Aug 27, 2023Updated 2 years ago
- ☆13May 17, 2025Updated last year