dominiquegarmier / grok-pytorch
pytorch implementation of grok
☆12 Updated 2 months ago
Alternatives and similar repositories for grok-pytorch:
Users who are interested in grok-pytorch are comparing it to the libraries listed below.
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 7 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- Using multiple LLMs for ensemble forecasting ☆16 Updated last year
- Eh, simple and works. ☆27 Updated last year
- ☆22 Updated last year
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models) ☆33 Updated 2 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆21 Updated 8 months ago
- ☆27 Updated 9 months ago
- ☆54 Updated last year
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated last year
- Hugging Face Deep RL Class notes ☆10 Updated 2 years ago
- A synthetic story narration dataset to study small audio LMs. ☆32 Updated last year
- The Next Generation Multi-Modality Superintelligence ☆71 Updated 7 months ago
- A list of language models with permissive licenses such as MIT or Apache 2.0 ☆24 Updated last month
- A sample pattern for running CI tests on Modal ☆17 Updated 2 weeks ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆17 Updated last month
- Latent Large Language Models ☆18 Updated 8 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 Updated this week
- An implementation of Anthropic's paper and essay on "A statistical approach to model evaluations" ☆16 Updated 3 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 2 months ago
- Build agentic workflows with function calling using open LLMs ☆26 Updated 2 weeks ago
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆32 Updated 2 months ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 Updated 2 years ago
- ☆17 Updated this week
- An open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆96 Updated last month
- A JAX-like function transformation engine, but micro: microjax ☆30 Updated 6 months ago
- Collection of autoregressive model implementations ☆85 Updated this week
- Modified beam search with periodic restarts ☆12 Updated 7 months ago
- Simple implementation of TinyGPTV in super simple Zeta lego blocks ☆16 Updated 5 months ago
- ☆10 Updated 6 months ago