fattorib / Little-GPT
GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more!
☆23 · Updated 2 years ago
Alternatives and similar repositories for Little-GPT:
Users that are interested in Little-GPT are comparing it to the libraries listed below
- Demonstration that finetuning a RoPE model on longer sequences than it was pre-trained on extends the model's context limit ☆63 · Updated last year
- One stop shop for all things carp ☆59 · Updated 2 years ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 · Updated last year
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆32 · Updated last year
- Experimental sampler to make LLMs more creative ☆30 · Updated last year
- GoldFinch and other hybrid transformer components ☆43 · Updated 6 months ago
- ☆31 · Updated 8 months ago
- A library for squeakily cleaning and filtering language datasets. ☆46 · Updated last year
- ☆32 · Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 · Updated last year
- Merge LLMs that are split into parts ☆26 · Updated last year
- RWKV model implementation ☆37 · Updated last year
- Create soft prompts for fairseq 13B dense, GPT-J-6B and GPT-Neo-2.7B for free in a Google Colab TPU instance ☆27 · Updated last year
- ☆44 · Updated 8 months ago
- Efficiently computing & storing token n-grams from large corpora ☆18 · Updated 4 months ago
- An open-source replication and extension of Meta AI's LLaMA dataset ☆24 · Updated last year
- A library for simplifying fine tuning with multi gpu setups in the Huggingface ecosystem. ☆16 · Updated 3 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆63 · Updated last year
- Minimum Description Length probing for neural network representations ☆18 · Updated 3 weeks ago
- ☆74 · Updated last year
- Code repository for the c-BTM paper ☆105 · Updated last year
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆32 · Updated 4 months ago
- ☆44 · Updated 3 months ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Updated 10 months ago
- Hidden Engrams: Long Term Memory for Transformer Model Inference ☆35 · Updated 3 years ago
- Public Inflection Benchmarks ☆69 · Updated 11 months ago
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) ☆30 · Updated 8 months ago
- URL downloader supporting checkpointing and continuous checksumming. ☆19 · Updated last year
- ☆26 · Updated 11 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆17 · Updated last week