cornstarch-org / Cornstarch

☆106

Alternatives and similar repositories for Cornstarch

Users who are interested in Cornstarch are comparing it to the libraries listed below.

facebookresearch / LayerSkip
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
☆341 Updated 5 months ago
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆296 Updated last month
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆251 Updated this week
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆201 Updated last year
facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…
☆342 Updated 10 months ago
eqimp / hogwild_llm
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
☆125 Updated last month
huggingface / gpt-oss-recipes
Collection of scripts and notebooks for OpenAI's latest GPT OSS models
☆455 Updated last month
cray-lm / cray-lm
Cray-LM unified training and inference stack.
☆22 Updated 8 months ago
snowflakedb / ArcticInference
ArcticInference: vLLM plugin for high-throughput, low-latency inference
☆270 Updated this week
huggingface / picotron_tutorial
☆222 Updated last week
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆172 Updated 8 months ago
changjonathanc / flex-nano-vllm
FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference.
☆290 Updated 2 months ago
wolfecameron / nanoMoE
An extension of the nanoGPT repository for training small MoE models.
☆196 Updated 7 months ago
DeepAuto-AI / hip-attention
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
☆148 Updated this week
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆161 Updated 5 months ago
fangyuan-ksgk / Tiny-GRPO
Minimal GRPO implementation from scratch
☆98 Updated 6 months ago
huggingface / optimum-tpu
Google TPU optimizations for transformers models
☆120 Updated 8 months ago
NVIDIA / Star-Attention
Efficient LLM Inference over Long Sequences
☆390 Updated 3 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58-bit language models
☆90 Updated 4 months ago
snowflakedb / ArcticTraining
ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)
☆224 Updated this week
huggingface / kernels
Load compute kernels from the Hub
☆293 Updated last week
NVlabs / hymba
☆199 Updated 10 months ago
writer / writing-in-the-margins
☆119 Updated last year
tilde-research / MoMoE-impl
Memory optimized Mixture of Experts
☆68 Updated 2 months ago
arcee-ai / EvolKit
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…
☆240 Updated 11 months ago
Pints-AI / 1.5-Pints
A compact LLM pretrained in 9 days using high-quality data
☆330 Updated 6 months ago
huggingface / kernel-builder
👷 Build compute kernels
☆155 Updated this week
unslothai / unsloth-zoo
Utils for Unsloth https://github.com/unslothai/unsloth
☆153 Updated last week
QuixiAI / grokadamw
☆136 Updated last year
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆109 Updated 5 months ago