Aleph-Alpha / trigrams
☆50 · Updated 5 months ago
Alternatives and similar repositories for trigrams:
Users who are interested in trigrams are comparing it to the libraries listed below.
- ☆47 · Updated 5 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆23 · Updated 4 months ago
- ☆70 · Updated 5 months ago
- BPE modification that implements removal of intermediate tokens during tokenizer training. ☆25 · Updated 2 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆53 · Updated 5 months ago
- ☆53 · Updated last year
- ☆48 · Updated 2 months ago
- A repository for research on medium-sized language models. ☆76 · Updated 8 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Updated last year
- ☆31 · Updated 7 months ago
- Code for Zero-Shot Tokenizer Transfer ☆121 · Updated 2 weeks ago
- ☆41 · Updated last year
- Training code for Sparse Autoencoders on Embedding models ☆35 · Updated 2 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆56 · Updated 2 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 · Updated 10 months ago
- ☆98 · Updated last week
- PyTorch building blocks for OLMo ☆49 · Updated this week
- Train, tune, and infer the Bamba model ☆80 · Updated 2 weeks ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆14 · Updated last year
- My fork of Allen AI's OLMo for educational purposes. ☆30 · Updated last month
- ☆49 · Updated 10 months ago
- Collection of autoregressive model implementations ☆77 · Updated 3 weeks ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆67 · Updated 3 months ago
- Experiments for efforts to train a new and improved t5 ☆77 · Updated 9 months ago
- LLM training in simple, raw C/CUDA ☆14 · Updated last month
- ☆118 · Updated last week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆91 · Updated 2 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆79 · Updated 10 months ago
- gzip Predicts Data-dependent Scaling Laws ☆33 · Updated 8 months ago