Norod / TrainGPT2-127M-FromScratch
A trio of Google-Colab notebooks (ipynb) for training a GPT-2 (127M) model from scratch (useful for other / non-English languages) using gpt-2-simple
☆16Updated 5 years ago
Alternatives and similar repositories for TrainGPT2-127M-FromScratch
Users that are interested in TrainGPT2-127M-FromScratch are comparing it to the libraries listed below
- ☆33Updated 2 years ago
- ☆44Updated 3 years ago
- One stop shop for all things carp☆59Updated 3 years ago
- Latent Diffusion Language Models☆70Updated 2 years ago
- Experimental sampler to make LLMs more creative☆31Updated 2 years ago
- [WIP] A 🔥 interface for running code in the cloud☆86Updated 2 years ago
- Smol but mighty language model☆63Updated 2 years ago
- GPT-jax based on the official huggingface library☆13Updated 4 years ago
- Finetune any model on HF in less than 30 seconds☆56Updated 2 months ago
- Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA☆104Updated 7 months ago
- Using short models to classify long texts☆21Updated 2 years ago
- Demonstration that finetuning a RoPE model on larger sequences than the pre-trained model adapts the model context limit☆63Updated 2 years ago
- ☆27Updated 2 years ago
- Merge LLMs that are split into parts☆27Updated 5 months ago
- 🎨 Imagine what Picasso could have done with AI. Self-host your StableDiffusion API.☆50Updated 2 years ago
- ☆51Updated 2 years ago
- ☆64Updated 2 years ago
- SCREWS: A Modular Framework for Reasoning with Revisions☆27Updated 2 years ago
- 🚀🤗 A collection of templates for Hugging Face Spaces☆35Updated 2 years ago
- ☆30Updated 4 years ago
- The Next Generation Multi-Modality Superintelligence☆70Updated last year
- Tune MPTs☆84Updated 2 years ago
- Implementation of the Mamba SSM with hf_integration.☆56Updated last year
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…☆121Updated 2 years ago
- QLoRA for Masked Language Modeling☆22Updated 2 years ago
- Consists of the largest (10K) human-annotated code-switched semantic parsing dataset & 170K generated utterances using the CST5 augmentati…☆41Updated 2 years ago
- Experiments with generating opensource language model assistants☆97Updated 2 years ago
- Load any clip model with a standardized interface☆22Updated 2 months ago
- Command-line script for inferencing from models such as LLaMA, in a chat scenario, with LoRA adaptations☆33Updated 2 years ago
- Thispersondoesnotexist went down, so this time, while building it back up, I am going to open source all of it.☆91Updated 2 years ago