PyTorch Implementation of GPT-2
☆32Sep 4, 2024Updated last year
Alternatives and similar repositories for gpt2-from-scratch
Users that are interested in gpt2-from-scratch are comparing it to the libraries listed below.
- [KDD 2023] code for "Test accuracy vs. generalization gap: model selection in NLP without accessing training or testing data" https://arx…☆12Oct 17, 2022Updated 3 years ago
- ☆14Feb 5, 2025Updated last year
- High Performance FP8 GEMM Kernels for SM89 and later GPUs.☆20Jan 24, 2025Updated last year
- Interpreting the latent space representations of attention head outputs for LLMs☆39Aug 13, 2024Updated last year
- Elixir: Train a Large Language Model on a Small GPU Cluster☆15Jun 8, 2023Updated 2 years ago
- ☆23Mar 4, 2026Updated 3 weeks ago
- 45+ production-ready tutorials on data science, MLOps, and AI tools. All code is executable and adaptable for real projects.☆23Mar 10, 2026Updated 2 weeks ago
- ☆17Apr 9, 2025Updated 11 months ago
- This is a repository of Binary General Matrix Multiply (BGEMM) using a customized CUDA kernel. Thanks to FP6-LLM for the wheels!☆18Aug 30, 2024Updated last year
- (NeurIPS 2024) One-shot Federated Learning via Synthetic Distiller-Distillate Communication☆18Mar 11, 2025Updated last year
- NSFW detection and annotator application for images. Detects and segments only nudity for now.☆15Mar 4, 2026Updated 3 weeks ago
- Genarris is a random molecular crystal structure generator.☆30Updated this week
- ☆14Jun 25, 2024Updated last year
- This After Effects script helps users build a composition structure for the Twixtor effect over one or more layers with a single click,…☆13Mar 20, 2022Updated 4 years ago
- [NeurIPS 2024] VeLoRA : Memory Efficient Training using Rank-1 Sub-Token Projections☆21Oct 15, 2024Updated last year
- Web application to use VidGear Stabilizer easily☆14Feb 15, 2021Updated 5 years ago
- Multi-agent investing agent using Claude Agent SDK☆13Oct 3, 2025Updated 5 months ago
- Baidu Maps coordinate picker tool☆12Jan 27, 2018Updated 8 years ago
- Official implementation of Humans as Calibration Pattern: Dynamic 3D Scene Reconstruction from Unsynchronized and Uncalibrated Videos (IC…☆19Oct 21, 2025Updated 5 months ago
- ☆29Dec 15, 2025Updated 3 months ago
- A machine-readable constitution for AI — Soul’s creativity hardened into ÆON☆16Sep 14, 2025Updated 6 months ago
- An AlphaZero engine for Saiblo Connect4, featuring a pure Python implementation of key KataGo techniques.☆16Feb 26, 2026Updated last month
- This repo has scripts to compare various powerful RL methods☆41Feb 23, 2026Updated last month
- Deep Learning for Energy Efficient Beamforming in MU-MISO Networks: A GAT-based Approach☆15Apr 22, 2023Updated 2 years ago
- Reactive, Eloquent-inspired ORM for Dart & Flutter designed for PowerSync, SQLite or whatever you want.☆23Feb 12, 2026Updated last month
- A self-made NeurIPS poster template, infused with the unique design style of ShanghaiTech.☆15Dec 26, 2023Updated 2 years ago
- Error monitor for Spring Boot☆16Nov 26, 2021Updated 4 years ago
- HUHEMS is a full-stack exam management system for Haramaya University. It supports admin-managed exams and question banks, student exam a…☆35Mar 13, 2026Updated 2 weeks ago
- Code for "Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shift". (NeurIPS 24)☆19Apr 21, 2025Updated 11 months ago
- CarlaDataCollector is a lightweight framework for efficient data collection in Carla simulation environment.☆14Jan 9, 2024Updated 2 years ago
- ☆11Jan 24, 2025Updated last year
- Token-efficient autonomous task execution with context collapse for pi coding agent☆103Mar 19, 2026Updated last week
- This is AlpaGasus2-QLoRA based on LLaMA2 with AlpaGasus mechanism using QLoRA!☆15Nov 22, 2023Updated 2 years ago
- This repository is dedicated to diving into the world of machine learning through daily projects, tutorials, and insights.☆14Nov 19, 2024Updated last year
- [ICML 2025] Official PyTorch implementation of "NegMerge: Sign-Consensual Weight Merging for Machine Unlearning"☆15Nov 25, 2025Updated 4 months ago
- Some common CUDA kernel implementations (Not the fastest).☆29Dec 5, 2025Updated 3 months ago
- Golang standards Of Fundamental Astronomy☆19Dec 12, 2023Updated 2 years ago
- OllamaFX is a native, lightweight, and professional JavaFX desktop client for Ollama. Run Llama 3, Mistral, and Phi-3 locally with maximu…☆64Mar 6, 2026Updated 3 weeks ago
- A machine learning solution for extracting key entity values (weight, volume, dimensions) from product images.☆18Sep 17, 2024Updated last year