nisten/grokadamw

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/nisten/grokadamw)

nisten / grokadamw

new optimizer

☆20

Alternatives and similar repositories for grokadamw

Users that are interested in grokadamw are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

QuixiAI / grokadamw
View on GitHub
☆137Aug 19, 2024Updated last year
vivek3141 / ghostbuster-data
View on GitHub
Data from the paper "Ghostbuster: Detecting Text Ghostwritten by Large Language Models"
☆14May 27, 2024Updated 2 years ago
CLAIRE-Labo / flash_attention
View on GitHub
A basic pure pytorch implementation of flash attention
☆17Oct 28, 2024Updated last year
TrentBrick / SDMContinualLearner
View on GitHub
☆21Mar 1, 2023Updated 3 years ago
Da1sypetals / cuda-Wavelet-KAN
View on GitHub
CUDA implementation of Wavelet KAN.
☆17Jun 8, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Sanster / padding_free_llm_train
View on GitHub
☆16Feb 6, 2024Updated 2 years ago
secemp9 / rubrics
View on GitHub
a bunch of rubrics I made in different format and structure for llm judge and other use cases
☆16Sep 22, 2025Updated 10 months ago
alonsosilvaallende / knowledge-graph-generator
View on GitHub
Knowledge Graph Generator app
☆35Apr 18, 2024Updated 2 years ago
kuterd / opal_ptx
View on GitHub
Experimental GPU language with meta-programming
☆31Sep 6, 2024Updated last year
Helw150 / levanter
View on GitHub
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆16Jun 16, 2024Updated 2 years ago
geohot / babygrad
View on GitHub
One File Tensor Libraries
☆31Oct 7, 2025Updated 9 months ago
IST-DASLab / MicroAdam
View on GitHub
This repository contains code for the MicroAdam paper.
☆21Dec 14, 2024Updated last year
lucidrains / PEER-pytorch
View on GitHub
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
☆137Nov 1, 2025Updated 8 months ago
vikhyat / e_natten
View on GitHub
Blazingly fast neighborhood attention
☆15Nov 28, 2023Updated 2 years ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
technicolor-research / quick-adc
View on GitHub
Quick ADC
☆27May 31, 2019Updated 7 years ago
philippe-eecs / vitok
View on GitHub
☆34May 14, 2025Updated last year
Ag2S1 / Sibyl-System
View on GitHub
☆125Aug 13, 2024Updated last year
IST-DASLab / SparseFinetuning
View on GitHub
Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry
☆43Jan 15, 2024Updated 2 years ago
IST-DASLab / qmoe
View on GitHub
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆277Nov 3, 2023Updated 2 years ago
nanowell / Q-Sparse-LLM
View on GitHub
My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
☆37Aug 14, 2024Updated last year
DLYuanGod / ViT-1.58b
View on GitHub
☆20Jul 5, 2024Updated 2 years ago
giangdip2410 / HyperRouter
View on GitHub
Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"
☆33Nov 29, 2023Updated 2 years ago
MichaelEinhorn / trl-textworld
View on GitHub
☆13May 7, 2023Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
ScalingIntelligence / CATS
View on GitHub
☆33Nov 11, 2024Updated last year
josephsdavid / ChinesePM
View on GitHub
Extensive time series analysis of chinese PM2.5 content, using models from ARMA and VAR to LSTMs and dynamic time warping clustering
☆12Aug 17, 2019Updated 6 years ago
laxyapahuja / domicontacts
View on GitHub
sync google contacts with information from the dominos data breach <3
☆11May 24, 2021Updated 5 years ago
BorealisAI / flora-opt
View on GitHub
This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024.
☆108Jul 1, 2024Updated 2 years ago
ptrblck / DiscoGAN
View on GitHub
Webcam demo for SKTBrain/DiscoGAN
☆12Sep 11, 2019Updated 6 years ago
xiangze / SDIP
View on GitHub
Stable diffusion dedicated Hardware with multiple pipelined processor cores
☆14Apr 9, 2026Updated 3 months ago
Aleph-Alpha-Research / trigrams
View on GitHub
☆60Nov 18, 2025Updated 8 months ago
nyonicai / nyonic-public
View on GitHub
Reference implementation of models from Nyonic Model Factory
☆12May 13, 2024Updated 2 years ago
AnswerDotAI / exhash
View on GitHub
Verified Line-Addressed File Editor
☆16Updated this week
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
duterscmy / CD-MoE
View on GitHub
Official PyTorch implementation of CD-MOE
☆12Mar 18, 2026Updated 4 months ago
iankur / vqllm
View on GitHub
Residual vector quantization for KV cache compression in large language model
☆12Oct 22, 2024Updated last year
mixedbread-ai / batched
View on GitHub
The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…
☆161Jul 14, 2025Updated last year
epfml / pam
View on GitHub
☆16Dec 9, 2023Updated 2 years ago
multi-swe-bench / MSWE-agent
View on GitHub
☆16Apr 2, 2025Updated last year
cloneofsimo / zeroshampoo
View on GitHub
☆33Sep 10, 2024Updated last year
apple / ml-np-rasp
View on GitHub
☆22Jan 19, 2024Updated 2 years ago