clankur / einygpt
A transformer implemented primarily using einops, trained on the TinyStories dataset.
☆ 12 · Updated 2 months ago
Related projects:
- URL downloader supporting checkpointing and continuous checksumming. ☆ 19 · Updated 9 months ago
- Efficiently computing & storing token n-grams from large corpora. ☆ 15 · Updated 2 weeks ago
- GPT* - Training faster small transformers using ALiBi, parallel residual connections, and more. ☆ 23 · Updated last year
- Deconstructing RWKV in understandable terms. ☆ 14 · Updated last year
- Demonstration that fine-tuning a RoPE model on sequences longer than the pre-training length extends the model's context limit. ☆ 62 · Updated last year
- Simple, fast, parallel Hugging Face GGML model downloader written in Python. ☆ 24 · Updated last year
- A client library for LAION's effort to filter Common Crawl with CLIP, building a large-scale image-text dataset. ☆ 31 · Updated last year
- A file utility for accessing both local and remote files through a unified interface. ☆ 36 · Updated last month
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆ 15 · Updated 3 months ago
- [COLM '24] Source-Aware Training Enables Knowledge Attribution in Language Models. ☆ 13 · Updated last month
- Discord bot that generates messages using GPT-2. ☆ 20 · Updated 5 years ago
- Documentation effort for the BookCorpus dataset. ☆ 30 · Updated 3 years ago
- Jupyter notebooks and an R notebook for encoding Pokémon embeddings and creating data visualizations. ☆ 16 · Updated 2 months ago
- Code accompanying the paper "A Language Model's Guide Through Latent Space"; contains functionality for training and using concept vec… ☆ 16 · Updated 6 months ago
- Experimental sampler to make LLMs more creative. ☆ 29 · Updated last year
- Fast inference of instruct-tuned LLaMA on your personal devices. ☆ 22 · Updated last year
- JAX implementations of RWKV. ☆ 18 · Updated 11 months ago
- Public reports detailing responses to sets of prompts by large language models. ☆ 25 · Updated 11 months ago
- A new way to generate large quantities of high-quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆ 19 · Updated last month
- Understanding how features learned by neural networks evolve throughout training. ☆ 30 · Updated this week
- Implementation of https://arxiv.org/pdf/2312.09299. ☆ 19 · Updated 2 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA. ☆ 35 · Updated 6 months ago
- Ongoing research training transformer language models at scale, including BERT & GPT-2. ☆ 18 · Updated last year
- Latent large language models. ☆ 16 · Updated 3 weeks ago
- PyTorch library for synthesizing programs from natural language. ☆ 18 · Updated last month
- Rust bindings for CTranslate2. ☆ 13 · Updated last year
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers`. ☆ 18 · Updated 2 years ago
- Run embedding models using ONNX. ☆ 23 · Updated 7 months ago
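One entry above covers minimal BPE tokenization code. As a toy illustration of the core idea (my own sketch, not code from that repository), a single BPE merge step counts adjacent token pairs and replaces the most frequent pair with a fresh token id:

```python
from collections import Counter

def merge_step(ids):
    """One BPE merge: fuse the most frequent adjacent pair into a new token id.

    Toy sketch for illustration only; real tokenizers track a learned merge table.
    """
    pairs = Counter(zip(ids, ids[1:]))
    if not pairs:
        return ids, None
    top = max(pairs, key=pairs.get)   # most frequent adjacent pair
    new_id = max(ids) + 1             # assign the next unused token id
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == top:
            out.append(new_id)        # replace the pair with the merged token
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out, top

ids, merged = merge_step([1, 2, 1, 2, 3])
# (1, 2) occurs twice, so it becomes token 4: ids == [4, 4, 3]
```

Training a full BPE vocabulary is just this step repeated until a target vocabulary size is reached, recording each merged pair in order.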