ClashLuke / MinRETRO
Reimplementation of `Improving language models by retrieving from trillions of tokens`
★19 · Updated 3 years ago
Alternatives and similar repositories for MinRETRO
Users interested in MinRETRO are comparing it to the libraries listed below.
- Understand and test language model architectures on synthetic tasks. · ★252 · Updated last month
- A MAD laboratory to improve AI architecture designs 🧪 · ★138 · Updated last year
- Emergent world representations: Exploring a sequence model trained on a synthetic task · ★201 · Updated 2 years ago
- [NeurIPS 2023] Learning Transformer Programs · ★162 · Updated last year
- ★185 · Updated 2 years ago
- Utilities for the HuggingFace transformers library · ★74 · Updated 3 years ago
- ★132 · Updated 2 years ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" · ★248 · Updated 8 months ago
- Tools for understanding how transformer predictions are built layer-by-layer · ★567 · Updated 6 months ago
- ★78 · Updated 3 years ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper · ★135 · Updated 3 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. · ★187 · Updated 3 weeks ago
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… · ★216 · Updated 3 weeks ago
- Train very large language models in Jax. · ★210 · Updated 2 years ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day · ★260 · Updated 2 years ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) · ★198 · Updated last year
- ★367 · Updated last year
- Erasing concepts from neural representations with provable guarantees · ★243 · Updated last year
- Scaling Data-Constrained Language Models · ★341 · Updated 7 months ago
- ★77 · Updated last year
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT · ★224 · Updated last year
- A library for efficient patching and automatic circuit discovery. · ★88 · Updated last month
- Bootstrapping ARC · ★155 · Updated last year
- nanoGPT-like codebase for LLM training · ★113 · Updated 3 months ago
- ★284 · Updated last year
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … · ★241 · Updated 2 weeks ago
- Implementation of Bitune: Bidirectional Instruction-Tuning · ★23 · Updated 7 months ago
- Sparse Autoencoder Training Library · ★56 · Updated 9 months ago
- ★208 · Updated 3 weeks ago
- ★116 · Updated last year