cofe-ai/MSG

Masked Structural Growth for 2x Faster Language Model Pre-training

☆25

Alternatives and similar repositories for MSG

Users who are interested in MSG are comparing it to the repositories listed below.

cofe-ai / nanoLM
An Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales
☆16 · Jun 6, 2024 · Updated 2 years ago
allenai / staged-training
Staged Training for Transformer Language Models
☆33 · Mar 31, 2022 · Updated 4 years ago
cofe-ai / Sketch
☆18 · Sep 5, 2024 · Updated last year
Nicolas-BZRD / llm-distillation
☆11 · Feb 3, 2025 · Updated last year
thunlp / Knowledge-Inheritance
Source code for paper: Knowledge Inheritance for Pre-trained Language Models
☆37 · Apr 24, 2022 · Updated 4 years ago
Alignment-Lab-AI / datagen
a pipeline for using api calls to agnostically convert unstructured data into structured training data
☆32 · Sep 22, 2024 · Updated last year
bentherien / mu_learned_optimization
[Poster; ICLR 2026] [Oral; Neurips OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
☆16 · Apr 15, 2026 · Updated 3 months ago
kiyan-rezaee / Systematic-Literature-Review-on-Online-Continual-Learning
☆14 · Jan 10, 2025 · Updated last year
cofe-ai / Mu-scaling
Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
☆32 · Jul 17, 2023 · Updated 3 years ago
pstutz / syncodia
Bridging Large Language Models with Scala 3 Functions
☆11 · Aug 31, 2024 · Updated last year
WalterSimoncini / no-train-all-gain
Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations"
☆11 · Oct 31, 2024 · Updated last year
zhangir-azerbayev / proof-pile
Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code.
☆22 · Nov 26, 2022 · Updated 3 years ago
neandertech / smithy4s-fetch
Smithy4s client directly using Fetch APIs, without bringing http4s/cats, to dramatically reduce bundle size
☆13 · Jun 1, 2026 · Updated last month
bcmi220 / ggdp
Global Greedy Dependency Parsing
☆10 · Mar 16, 2021 · Updated 5 years ago
edwardmilsom / function-space-learning-rates-paper
Code for the paper "Function-Space Learning Rates"
☆23 · Jun 3, 2025 · Updated last year
NeuroCompLab-psu / SpikingBERT
This project contains code for the paper titled "SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentia…
☆28 · Feb 21, 2024 · Updated 2 years ago
YulinLi0 / minimum_scaling_free_region
☆13 · Oct 22, 2024 · Updated last year
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Jul 15, 2026 · Updated last week
IDSIA / rtrl-elstm
Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024)
☆13 · Jun 11, 2025 · Updated last year
shenshuaijie / SDN
The code of SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models
☆23 · Mar 25, 2026 · Updated 4 months ago
BICLab / MetaLA
Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS2024 Oral)
☆36 · Jan 18, 2025 · Updated last year
Hi-ZenanXu / Syntax-Enhanced_Pre-trained_Model
Source Data of ACL2021 paper "Syntax-Enhanced Pre-trained Model"
☆11 · Jun 1, 2021 · Updated 5 years ago
BingSu12 / Log-Polar-Space-Convolution
Log-Polar Space Convolution for Convolutional Neural Networks
☆13 · Dec 12, 2022 · Updated 3 years ago
clulab / scala-transformers
Scala interfaces to huggingface transformers and tokenizers
☆13 · Jul 14, 2026 · Updated last week
dilinwang820 / fast-energy-aware-splitting
Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent
☆14 · Feb 6, 2020 · Updated 6 years ago
clulab / worldtree-api
Explanation-centered inference for question answering
☆16 · Feb 7, 2018 · Updated 8 years ago
stzhang-patrick / ArcMMLU
☆16 · Feb 2, 2024 · Updated 2 years ago
kanezaki / MIRO
☆13 · Apr 16, 2018 · Updated 8 years ago
MajorDavidZhang / MCL
code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
☆20 · Jul 16, 2024 · Updated 2 years ago
jphacks / SD_1702
☆14 · Nov 21, 2017 · Updated 8 years ago
clulab / qup
qup: a Single-Node Job Scheduler with NVIDIA GPU support
☆18 · Jan 10, 2023 · Updated 3 years ago
umd-mith / topic-modeling
Topic modeling utilities
☆15 · Nov 14, 2013 · Updated 12 years ago
oja / aosumm
Summarize a document conditioned on aspect keywords.
☆17 · Sep 7, 2022 · Updated 3 years ago
Nicolas-BZRD / llm-recipes
☆33 · Mar 13, 2024 · Updated 2 years ago
mr-eggplant / adaxpert0
Code for AdaXpert (ICML'21)
☆16 · Jul 19, 2021 · Updated 5 years ago
YiteWang / lemon-pytorch
This is the unofficial implementation of LEMON (ICLR'2024).
☆13 · Apr 14, 2024 · Updated 2 years ago
luqui / dana
Dana - a purely functional (virtual) operating system
☆16 · Jul 6, 2009 · Updated 17 years ago
hayaalsh / yolo_ros_vino
YoloV3 + OpenVINO + ROS
☆11 · Feb 3, 2019 · Updated 7 years ago
nverma1 / merging-text-transformers
Code for "Merging Text Transformers from Different Initializations"
☆20 · Feb 2, 2025 · Updated last year