Code to train a GPT-2 model on the TinyStories dataset, following the TinyStories paper
☆40Nov 24, 2023Updated 2 years ago
Alternatives and similar repositories for TinyStories
Users that are interested in TinyStories are comparing it to the libraries listed below.
- Exploring the minimal architecture required for coherent English language generation.☆12Mar 5, 2025Updated last year
- see github.com/understanding-search/maze-transformer☆10Dec 8, 2023Updated 2 years ago
- Mathematica and Matlab toolboxes for Clifford algebras of n-dimensional Euclidean vector spaces☆11Jan 24, 2018Updated 8 years ago
- Example code for the book 《차근차근 실습하며 배우는 파이토치 딥러닝 프로그래밍》 (PyTorch Deep Learning Programming, Learned Step by Step Through Practice)☆23Aug 17, 2022Updated 3 years ago
- customizable template GPT code designed for easy novel architecture experimentation☆26Mar 19, 2025Updated last year
- ☆15May 20, 2023Updated 2 years ago
- Deploy your GGML models to HuggingFace Spaces with Docker and gradio☆38Jun 6, 2023Updated 2 years ago
- Automatic subordinate clause extractor☆11Jul 7, 2022Updated 3 years ago
- ☆12Aug 19, 2023Updated 2 years ago
- The official Languini Kitchen repository☆14May 6, 2024Updated last year
- Core Engine of Singing Voice Conversion & Singing Voice Clone☆15Jul 15, 2023Updated 2 years ago
- ☆11Sep 25, 2025Updated 6 months ago
- ☆12Jun 22, 2024Updated last year
- [ACL 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought"☆17Apr 3, 2025Updated last year
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated 2 years ago
- ☆34May 28, 2023Updated 2 years ago
- [ICML'24] Conformal Prediction for Deep Classifier via Label Ranking☆13Jun 14, 2024Updated last year
- Synthetic Hypertext and Homomorphic Catalogue☆15Dec 28, 2024Updated last year
- Policy Transfer across Visual and Dynamics Domain Gaps via Iterative Grounding (RSS 2021)☆12Oct 22, 2021Updated 4 years ago
- Computation of binomial confidence intervals that achieve exact coverage.☆14Apr 23, 2025Updated 11 months ago
- Conversion script adapting vicuna dataset into alpaca format for use with oobabooga's trainer☆13Jun 21, 2023Updated 2 years ago
- Super learning of conditional survival functions with right-censored time-to-event outcomes in discrete or continuous time.☆15Dec 9, 2024Updated last year
- Uncertainty quantification for in-context learning of large language models☆15Apr 1, 2024Updated 2 years ago
- Code for "Transformer-Based Deep Survival Analysis"☆12May 27, 2022Updated 3 years ago
- ☆13Nov 1, 2023Updated 2 years ago
- a WIP architecture designed to allow transformers to think in a manner without tokens☆20Apr 12, 2024Updated 2 years ago
- ☆14Jul 24, 2024Updated last year
- Launch a full-fledged D&D 5e text adventure in seconds. Generate a unique, procedurally crafted world—complete with kingdoms, guilds, and…☆27Apr 9, 2026Updated last week
- Patch for MPT-7B which allows using and training a LoRA☆58May 20, 2023Updated 2 years ago
- ☆12Apr 3, 2024Updated 2 years ago
- This repository showcases how to use the DynamixelSDK C++ and Python APIs to control an Interbotix XSeries Arm.☆17Apr 28, 2021Updated 4 years ago
- Pytorch routines for (Ker)nel (Mac)hines☆12Oct 10, 2025Updated 6 months ago
- Implementation of Diffusion Policy☆13Dec 13, 2024Updated last year
- Hercules: Attributable and Scalable Opinion Summarization (ACL 2023)☆20Nov 8, 2023Updated 2 years ago
- VersaPlayer Extension to enable airplay functionality☆11Mar 2, 2020Updated 6 years ago
- Simple MoE - Day 17 of 365 Days of Repos☆18Jan 17, 2025Updated last year
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- ☆19Sep 24, 2024Updated last year
- Curiosity in Multi-Step Motion Planning☆13Jul 15, 2020Updated 5 years ago