fpaboim / tinysparse
A fork of tinygrad made to work with sparse tensors. Sparse neural networks are here!
☆11Updated 3 years ago
Alternatives and similar repositories for tinysparse
Users who are interested in tinysparse are comparing it to the libraries listed below.
- Latent Large Language Models☆18Updated 8 months ago
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te…☆42Updated last year
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors`☆44Updated 11 months ago
- ☆49Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.)☆32Updated 11 months ago
- ☆38Updated 9 months ago
- Tree-based indexes for neural-search☆31Updated last year
- NLP with Rust for Python 🦀🐍☆62Updated this week
- FastFeedForward Networks☆20Updated last year
- ☆60Updated 3 years ago
- ☆27Updated 10 months ago
- Utilities for Training Very Large Models☆58Updated 7 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"☆37Updated last year
- GPU accelerated client-side embeddings for vector search, RAG etc.☆66Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence☆60Updated 3 years ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆17Updated last month
- RWKV model implementation☆37Updated last year
- Implementation of Spectral State Space Models☆16Updated last year
- microjax: a JAX-like function transformation engine, but micro☆32Updated 6 months ago
- Implementation of https://arxiv.org/pdf/2312.09299☆20Updated 10 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.☆45Updated 10 months ago
- Latent Diffusion Language Models☆68Updated last year
- [WIP] Transformer to embed Danbooru labelsets☆13Updated last year
- Efficient encoder-decoder architecture for small language models (≤1B parameters) with cross-architecture knowledge distillation and visi…☆23Updated 3 months ago
- ☆22Updated last year
- A stateful pytree library for training neural networks.☆22Updated 2 years ago
- Experiment in using Tangent to autodiff Triton☆78Updated last year
- ☆19Updated last month
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX☆53Updated last year
- QLoRA with Enhanced Multi GPU Support☆37Updated last year