fattorib / Little-GPT
GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more!
☆23 Updated last year
Related projects:
- Demonstration that fine-tuning a RoPE model on sequences longer than those seen in pre-training adapts the model's context limit ☆62 Updated last year
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆23 Updated last year
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆15 Updated 3 months ago
- Measuring and Controlling Persona Drift in Language Model Dialogs ☆11 Updated 6 months ago
- URL downloader supporting checkpointing and continuous checksumming. ☆19 Updated 9 months ago
- One-stop shop for all things carp ☆58 Updated 2 years ago
- Efficiently computing & storing token n-grams from large corpora ☆15 Updated 2 weeks ago
- A library for squeakily cleaning and filtering language datasets. ☆45 Updated last year
- An open-source replication and extension of Meta AI's LLaMA dataset ☆24 Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- Experimental sampler to make LLMs more creative ☆29 Updated last year
- Trying to deconstruct RWKV in understandable terms ☆14 Updated last year
- Based on the Tree of Thoughts paper ☆45 Updated last year
- LLMs as Collaboratively Edited Knowledge Bases ☆40 Updated 7 months ago
- GoldFinch and other hybrid transformer components ☆38 Updated 2 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆60 Updated last year
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335) ☆25 Updated 6 months ago
- Implementation of https://arxiv.org/pdf/2312.09299 ☆19 Updated 2 months ago
- Latent Large Language Models ☆16 Updated 3 weeks ago
- Small and Efficient Mathematical Reasoning LLMs ☆69 Updated 7 months ago
- [COLM '24] Source-Aware Training Enables Knowledge Attribution in Language Models ☆13 Updated last month
- RWKV model implementation ☆38 Updated last year