google-deepmind / transformer_ngrams
☆31 · Updated 11 months ago
Alternatives and similar repositories for transformer_ngrams
Users that are interested in transformer_ngrams are comparing it to the libraries listed below
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆194 · Updated 5 months ago
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism. ☆104 · Updated last month
- ☆197 · Updated 2 months ago
- Open source interpretability artefacts for R1. ☆163 · Updated 6 months ago
- Training-Ready RL Environments + Evals ☆158 · Updated this week
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆102 · Updated last month
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆170 · Updated 4 months ago
- Custom triton kernels for training Karpathy's nanoGPT. ☆19 · Updated last year
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆193 · Updated last year
- ☆142 · Updated last month
- 🧱 Modula software package ☆299 · Updated 2 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆72 · Updated 6 months ago
- Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r… ☆281 · Updated this week
- Our solution for the arc challenge 2024 ☆182 · Updated 4 months ago
- rl from zero pretrain, can it be done? yes. ☆279 · Updated last month
- ☆106 · Updated last week
- Simple & Scalable Pretraining for Neural Architecture Research ☆298 · Updated this week
- 📄Small Batch Size Training for Language Models ☆63 · Updated 3 weeks ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆109 · Updated 3 weeks ago
- Dion optimizer algorithm ☆374 · Updated last month
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆21 · Updated 3 months ago
- Evaluation of LLMs on latest math competitions ☆175 · Updated last week
- ☆149 · Updated 2 months ago
- a Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization in pure C. ☆22 · Updated last year
- A package for defining deep learning models using categorical algebraic expressions. ☆61 · Updated last year
- Open-source framework for the research and development of foundation models. ☆574 · Updated this week
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆31 · Updated 6 months ago
- Implementation of SOAR ☆42 · Updated last month
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆300 · Updated 2 months ago
- ☆502 · Updated 5 months ago