RobertRiachi / nanoPALM
☆143Updated last year
Alternatives and similar repositories for nanoPALM:
Users who are interested in nanoPALM are comparing it to the libraries listed below
- An interactive exploration of Transformer programming.☆261Updated last year
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi…☆343Updated 7 months ago
- Simple Transformer in Jax☆136Updated 9 months ago
- ☆92Updated last year
- Simple embedding -> text model trained on a small subset of Wikipedia sentences.☆153Updated last year
- Helpers and such for working with Lambda Cloud☆51Updated last year
- ☆153Updated 2 years ago
- The history files when recording human interaction while solving ARC tasks☆97Updated this week
- Full finetuning of large language models without large memory requirements☆93Updated last year
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and …☆201Updated 4 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.☆82Updated last year
- A puzzle to learn about prompting☆124Updated last year
- ☆22Updated last year
- A really tiny autograd engine☆90Updated 11 months ago
- run paligemma in real time☆131Updated 10 months ago
- Simplex Random Feature attention, in PyTorch☆74Updated last year
- Functional local implementations of main model parallelism approaches☆95Updated 2 years ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…☆168Updated this week
- ☆214Updated 8 months ago
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…☆121Updated last year
- Train very large language models in Jax.☆203Updated last year
- ☆166Updated 2 years ago
- AI sends pull requests for features you request in natural language☆113Updated last year
- git extension for {collaborative, communal, continual} model development☆208Updated 4 months ago
- a small code base for training large models☆288Updated 3 months ago
- Drive a browser with Cohere☆72Updated last year
- A collection of LLM services you can self host via docker or modal labs to support your applications development☆186Updated 10 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)☆186Updated 9 months ago
- Drop in replacement for OpenAI, but with Open models.☆153Updated last year
- Automatic gradient descent☆207Updated last year