kyo-takano/chinchilla

kyo-takano / chinchilla

A toolkit for scaling law research ⚖

☆69

Alternatives and similar repositories for chinchilla

Users who are interested in chinchilla are comparing it to the libraries listed below.

shehper / scaling_laws
An open-source implementation of Scaling Laws for Neural Language Models using nanoGPT
☆55 · Dec 8, 2023 · Updated 2 years ago
Phylliida / MambaLens
Mamba support for transformer lens
☆20 · Sep 17, 2024 · Updated last year
aredden / torch-cublas-hgemm
PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu
☆78 · Dec 3, 2024 · Updated last year
locuslab / scaling_laws_data_filtering
☆64 · Apr 9, 2024 · Updated 2 years ago
crowsonkb / torch-dist-utils
Utilities for PyTorch distributed
☆26 · Feb 27, 2025 · Updated last year
koayon / atp_star
PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)
☆20 · Jan 19, 2025 · Updated last year
cloneofsimo / scaling-guide
WIP
☆96 · Aug 13, 2024 · Updated last year
kimbochen / md-blogs
A blog where I write about research papers and blog posts I read.
☆12 · Nov 20, 2024 · Updated last year
TiledTensor / TiledBench
Benchmark tests supporting the TiledCUDA library.
☆19 · Nov 19, 2024 · Updated last year
GindaChen / FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
☆27 · Oct 5, 2024 · Updated last year
AISaturdaysLagos / cohort7_classes
This repository houses materials consulted by the instructors
☆13 · Jan 8, 2022 · Updated 4 years ago
Birch-san / booru-embed
[WIP] Transformer to embed Danbooru labelsets
☆13 · Mar 31, 2024 · Updated 2 years ago
Butanium / tiny-activation-dashboard
A tiny easily hackable implementation of a feature dashboard.
☆17 · Oct 21, 2025 · Updated 9 months ago
thecharlieblake / lovely-llama
An implementation of the Llama architecture, to instruct and delight
☆21 · May 31, 2025 · Updated last year
Chillee / lit-llama
Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code
☆10 · Aug 29, 2023 · Updated 2 years ago
kyegomez / MobileVLM
Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …
☆15 · Mar 11, 2024 · Updated 2 years ago
berlino / seq_icl
☆54 · May 20, 2024 · Updated 2 years ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆20 · Jan 28, 2025 · Updated last year
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆132 · Apr 17, 2024 · Updated 2 years ago
ryoungj / ObsScaling
[NeurIPS'24 Spotlight] Observational Scaling Laws
☆60 · Oct 2, 2024 · Updated last year
pytorch / torchdistx
Torch Distributed Experimental
☆117 · Aug 5, 2024 · Updated last year
sekstini / basedxl
☆18 · Mar 18, 2024 · Updated 2 years ago
Edward-Sun / gpt-accelera
Simple and efficient pytorch-native transformer training and inference (batched)
☆78 · Apr 2, 2024 · Updated 2 years ago
Zyphra / tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
☆134 · Dec 3, 2024 · Updated last year
proger / nanokitchen
Parallel Associative Scan for Language Models
☆18 · Jan 8, 2024 · Updated 2 years ago
ptillet / triton-llvm-releases
☆20 · Oct 11, 2023 · Updated 2 years ago
vedantpalit / Towards-Vision-Language-Mechanistic-Interpretability
This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce…
☆25 · Feb 16, 2026 · Updated 5 months ago
psorianom / modified_adsorption
Python implementation of the random-walk inductive classification algorithm Modified Adsorption from P. Talukdar
☆15 · Jul 30, 2014 · Updated 11 years ago
graphcore-research / unit-scaling
A library for unit scaling in PyTorch
☆135 · Jul 11, 2025 · Updated last year
VITA-Group / SSM-Bottleneck
[ICLR'25] "Understanding Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing" by Peihao Wang, Ruisi Cai, Yue…
☆18 · Mar 21, 2025 · Updated last year
MatX-inc / seqax
seqax = sequence modeling + JAX
☆195 · Jul 23, 2025 · Updated last year
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆605 · May 13, 2026 · Updated 2 months ago
HazyResearch / embroid
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
☆11 · Aug 12, 2023 · Updated 2 years ago
JonasGeiping / linear_cross_entropy_loss
A fusion of a linear layer and a cross entropy loss, written for pytorch in triton.
☆75 · Aug 2, 2024 · Updated last year
er537 / whisper_interpretability
A repo to do interpretability of pre-trained acoustic models
☆15 · Oct 15, 2023 · Updated 2 years ago
srush / Tensor-Puzzles-Penzai
☆22 · Apr 22, 2024 · Updated 2 years ago
irhum / hyena
JAX/Flax implementation of the Hyena Hierarchy
☆35 · Apr 27, 2023 · Updated 3 years ago
HazyResearch / prefix-linear-attention
☆62 · Jul 9, 2024 · Updated 2 years ago
alxndrTL / othello_mamba
Evaluating the Mamba architecture on the Othello game
☆49 · Apr 25, 2024 · Updated 2 years ago