google-deepmind / transformer_ngrams
☆33 · Updated last year
Alternatives and similar repositories for transformer_ngrams
Users who are interested in transformer_ngrams are comparing it to the libraries listed below.
- ☆143 · Updated 2 months ago
- Training-Ready RL Environments + Evals ☆177 · Updated this week
- Open source interpretability artefacts for R1. ☆163 · Updated 7 months ago
- ☆104 · Updated 3 months ago
- Evaluation of LLMs on latest math competitions ☆180 · Updated last month
- Our solution for the arc challenge 2024 ☆185 · Updated 5 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆196 · Updated 5 months ago
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆31 · Updated 7 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆255 · Updated last week
- ☆140 · Updated this week
- Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r… ☆290 · Updated this week
- ☆46 · Updated 7 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆72 · Updated 7 months ago
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism. ☆104 · Updated last month
- Stochastic Parameter Decomposition ☆51 · Updated last week
- ☆157 · Updated 3 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆173 · Updated 4 months ago
- rl from zero pretrain, can it be done? yes. ☆280 · Updated last month
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆111 · Updated last month
- ☆106 · Updated last month
- code for training & evaluating Contextual Document Embedding models ☆200 · Updated 6 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆60 · Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆130 · Updated 3 years ago
- ☆211 · Updated last week
- Open-source framework for the research and development of foundation models. ☆611 · Updated last week
- Simple & Scalable Pretraining for Neural Architecture Research ☆300 · Updated 3 weeks ago
- Dion optimizer algorithm ☆384 · Updated this week
- ☆478 · Updated 4 months ago
- ☆28 · Updated last month
- ☆37 · Updated 9 months ago