shikaiqiu/compute-better-spent

☆63

Alternatives and similar repositories for compute-better-spent

Users who are interested in compute-better-spent are comparing it to the repositories listed below.

AndPotap / einsum-search
☆34 · Oct 4, 2024 · Updated last year
thomasahle / cce
Clustered Compositional Embeddings
☆13 · Oct 25, 2023 · Updated 2 years ago
wtong98 / mlp-icl
☆12 · Sep 16, 2024 · Updated last year
edwardmilsom / function-space-learning-rates-paper
Code for the paper "Function-Space Learning Rates"
☆23 · Jun 3, 2025 · Updated last year
google-deepmind / spectral_ssm
☆35 · Apr 12, 2024 · Updated 2 years ago
wesg52 / llm-context-neurons
Find context neurons in Pythia models.
☆13 · Jun 13, 2023 · Updated 3 years ago
tobna / TaylorShift
This repository contains the code for the paper "TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back)…
☆15 · Feb 25, 2026 · Updated 5 months ago
VITA-Group / WeLore
[ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications
☆52 · Oct 30, 2025 · Updated 8 months ago
mfinzi / neural-ivp
☆11 · May 12, 2023 · Updated 3 years ago
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆46 · Jul 20, 2024 · Updated 2 years ago
berlino / seq_icl
☆54 · May 20, 2024 · Updated 2 years ago
expz / annotated-hyena
An annotated implementation of the Hyena Hierarchy paper
☆34 · May 28, 2023 · Updated 3 years ago
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆93 · Oct 30, 2024 · Updated last year
bentherien / mu_learned_optimization
[Poster; ICLR 2026] [Oral; Neurips OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
☆16 · Apr 15, 2026 · Updated 3 months ago
HazyResearch / m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
☆564 · Dec 28, 2024 · Updated last year
proger / accelerated-scan
Accelerated First Order Parallel Associative Scan
☆198 · Jan 7, 2026 · Updated 6 months ago
ethansmith2000 / TransformerExperiments
☆19 · Dec 4, 2025 · Updated 7 months ago
Bond1995 / Markov
Code for experiments on transformers using Markovian data.
☆22 · Nov 22, 2024 · Updated last year
mohsinulkabir14 / DEPTWEET
This repository contains the dataset 'DEPTWEET' published in the journal of Computers in Human Behavior.
☆12 · Jul 12, 2023 · Updated 3 years ago
HazyResearch / train-tk
train with kittens!
☆67 · Oct 25, 2024 · Updated last year
wilson-labs / cola
Compositional Linear Algebra
☆517 · Aug 1, 2025 · Updated 11 months ago
cloneofsimo / minDinoV2
☆24 · Oct 15, 2024 · Updated last year
teddysmithdev / angular-course
☆10 · Aug 16, 2022 · Updated 3 years ago
swairshah / Intensify
coloring terminal text with intensities (used for plotting probability, entropy with tokens)
☆12 · Oct 11, 2024 · Updated last year
nkalyanv99 / UNI-D2
☆54 · Jul 8, 2026 · Updated 3 weeks ago
Doraemonzzz / nanoTransNormer
☆11 · Oct 11, 2023 · Updated 2 years ago
aradha / lin-RFM
Code for lin-RFM used for sparse recovery tasks
☆17 · Mar 13, 2025 · Updated last year
kuleshov-group / proseco
Learn from Your Mistakes: Self-Correcting Masked Diffusion Models
☆16 · Jun 25, 2026 · Updated last month
google-deepmind / asyncdiloco
☆51 · Jan 18, 2024 · Updated 2 years ago
zqOuO / GWT
☆13 · May 4, 2026 · Updated 2 months ago
GindaChen / FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
☆27 · Oct 5, 2024 · Updated last year
johnryan465 / pscan
☆40 · Jan 5, 2024 · Updated 2 years ago
sustcsonglin / flash-linear-rnn
Implementations of various linear RNN layers using pytorch and triton
☆55 · Aug 4, 2023 · Updated 2 years ago
Arongil / lipschitz-transformers
Don't just regulate gradients like in Muon, regulate the weights too
☆32 · Jul 30, 2025 · Updated 11 months ago
CLAIRE-Labo / StructuredFFN
The official code of "Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers"
☆20 · Jul 24, 2024 · Updated 2 years ago
lucidrains / grokfast-pytorch
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"
☆104 · Dec 22, 2024 · Updated last year
UW-Madison-Lee-Lab / SFT-PG
Code for "Optimizing DDPM Sampling with Shortcut Fine-Tuning" (https://arxiv.org/abs/2301.13362), ICML 2023
☆30 · Oct 6, 2023 · Updated 2 years ago
30stomercury / hmm-backprop
Fast and differentiable hidden Markov model in C++
☆19 · Jan 20, 2023 · Updated 3 years ago
okn-yu / Visualizing-the-Loss-Landscape-of-Neural-Nets
☆19 · Jul 15, 2020 · Updated 6 years ago