An open-source implementation of Scaling Laws for Neural Language Models using nanoGPT
☆52 · Dec 8, 2023 · Updated 2 years ago
Alternatives and similar repositories for scaling_laws
Users who are interested in scaling_laws are comparing it to the repositories listed below
- An Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales ☆16 · Jun 6, 2024 · Updated last year
- A toolkit for scaling law research ⚖ ☆57 · Jan 27, 2025 · Updated last year
- Interpretable Diffusion Via Information Decomposition ☆29 · Jul 18, 2024 · Updated last year
- ☆34 · Sep 10, 2024 · Updated last year
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆36 · Jun 7, 2024 · Updated last year
- I use various Data Science and machine learning techniques to analyze customer data using STP framework. I preprocessed the data, perform… ☆12 · Apr 26, 2020 · Updated 5 years ago
- openapi of all third-party ☆10 · Updated this week
- An algorithm that intelligently executes a crypto order over time via Coinbase ☆12 · Oct 26, 2021 · Updated 4 years ago
- A distributed data flow and computation system that runs on transactional messaging infrastructure ☆11 · Oct 22, 2022 · Updated 3 years ago
- Build Your Own Neural Network Design ☆14 · Aug 3, 2020 · Updated 5 years ago
- This is an interactive mock-up of the SpaceX Dragon 2 spacecraft's user interface. It contains 5 panels and multiple amusing features. A … ☆11 · Jul 28, 2022 · Updated 3 years ago
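- AI model for making mazes that extends OpenAI's GPT-2 model ☆15 · Dec 21, 2023 · Updated 2 years ago
- A+つくば is an anonymous study-support social network exclusively for University of Tsukuba students. It is aimed at students who know few people in the same lectures, and it sets out to solve the problem of not being able to submit university assignments efficiently and at sufficient quality (can't get an A+!!). ☆11 · Nov 23, 2025 · Updated 3 months ago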
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder ☆10 · Mar 16, 2023 · Updated 2 years ago
- scrape, clean and model IPO data with supervised ML ☆10 · Aug 20, 2020 · Updated 5 years ago
- ☆14 · Jan 23, 2026 · Updated last month
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer ☆16 · Sep 7, 2024 · Updated last year
- PyTorch implementation of Project To Adapt (ACCV20 - Oral - Best Student Paper Award & IJCV 2022) ☆10 · Jan 30, 2023 · Updated 3 years ago
- ☆11 · Jun 4, 2021 · Updated 4 years ago
- [NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting ☆72 · Jan 9, 2026 · Updated 2 months ago
- yet another reinforcement learning package ☆12 · May 24, 2022 · Updated 3 years ago
- Layered distributions using FLAX/JAX ☆10 · Dec 13, 2020 · Updated 5 years ago
- ☆48 · Jan 18, 2024 · Updated 2 years ago
- Improving transparency of large language models' reasoning ☆14 · Nov 25, 2025 · Updated 3 months ago
- [ICLR 2025 SSI-FM] Self-Taught Self-Correction for Small Language Models ☆11 · Sep 19, 2025 · Updated 5 months ago
- JAX/Haiku implementation of "Auction Learning as a Two-Player Game" ☆11 · Jul 6, 2024 · Updated last year
- MATLAB implementation of the universal directed information estimators in Jiantao Jiao, Haim H. Permuter, Lei Zhao, Young-Han Kim, and Ts… ☆11 · Apr 2, 2019 · Updated 6 years ago
- Everything you need to reproduce "Better plain ViT baselines for ImageNet-1k" in PyTorch, and more ☆12 · Updated this week
- Gym implementation of connector to Deepmind lab ☆12 · Mar 26, 2019 · Updated 6 years ago
- A library to create lore plots (logistic regression of the prevalence of a categorical variable as a function of a continuous feature) ☆18 · Mar 1, 2026 · Updated last week
- A simple multicohort LTV calculator for subscriptions ☆11 · Mar 7, 2023 · Updated 3 years ago
- Code for the paper "Optimal Off-Policy Evaluation from Multiple Logging Policies" ☆15 · Jul 17, 2021 · Updated 4 years ago
- Official codebase for our paper "Do Language Models Use Their Depth Efficiently?" ☆29 · Jun 25, 2025 · Updated 8 months ago
- Offline Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits ☆10 · Oct 21, 2024 · Updated last year
- The official baseline implementations for Chronocept ☆10 · Dec 21, 2025 · Updated 2 months ago
- Multi-Objective Causal Bayesian Optimisation, a new paradigm for finding Pareto-optimal interventions in multi-outcome causal models ☆16 · Jun 2, 2025 · Updated 9 months ago
- Evolutionary Search for expert-level performance on any task with environmental feedback ☆14 · Oct 12, 2025 · Updated 4 months ago
- A simple script to add pdf-files to Zotero via CLI ☆12 · May 17, 2020 · Updated 5 years ago
- Data and code for EACL'24 paper: Over-Reasoning and Redundant Calculation of Large Language Models ☆11 · Jan 23, 2024 · Updated 2 years ago