areu01or00 / Tensor-Slayer
Tensor-Slayer: Manipulate the weights and tensors of LLMs to achieve performance upgrades, and introduce a novel inferenceless approach to mechanistic interpretability
☆17 Updated 4 months ago
Alternatives and similar repositories for Tensor-Slayer
Users who are interested in Tensor-Slayer are comparing it to the libraries listed below.
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 Updated 8 months ago
- ☆28 Updated last year
- look how they massacred my boy ☆63 Updated 11 months ago
- A repository of projects and datasets under active development by Alignment Lab AI ☆22 Updated last year
- ☆35 Updated 2 months ago
- ☆46 Updated last year
- ☆68 Updated 4 months ago
- ☆40 Updated last year
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 Updated 2 months ago
- A collection of lightweight interpretability scripts to understand how LLMs think ☆56 Updated last week
- Simple GRPO scripts and configurations. ☆59 Updated 8 months ago
- ☆54 Updated 10 months ago
- ☆25 Updated 4 months ago
- Simple repository for training small reasoning models ☆40 Updated 7 months ago
- Latent Large Language Models ☆19 Updated last year
- An introduction to LLM Sampling ☆79 Updated 9 months ago
- A JAX-like function transformation engine, but micro: microjax ☆32 Updated 11 months ago
- ☆62 Updated last year
- Project code for training LLMs to write better unit tests + code ☆21 Updated 4 months ago
- LLM training in simple, raw C/CUDA ☆15 Updated 10 months ago
- ☆88 Updated last year
- A pipeline for using API calls to agnostically convert unstructured data into structured training data ☆31 Updated last year
- train with kittens! ☆62 Updated 11 months ago
- ☆67 Updated last year
- Verbosity control for AI agents ☆65 Updated last year
- Training code for Sparse Autoencoders on Embedding models ☆38 Updated 7 months ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆65 Updated this week
- ☆28 Updated 3 months ago
- Collection of autoregressive model implementations ☆86 Updated 5 months ago
- Simplex Random Feature attention, in PyTorch ☆74 Updated last year