tensorops / TransformerX
Flexible Python library providing building blocks (layers) for reproducible Transformers research (Tensorflow, Pytorch, and Jax)
★53, updated last year
Alternatives and similar repositories for TransformerX
Users that are interested in TransformerX are comparing it to the libraries listed below
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" (★99, updated 4 months ago)
- Outlining techniques for improving the training performance of your PyTorch model without compromising its accuracy (★129, updated 2 years ago)
- Toy genetic algorithm in Pytorch (★49, updated 2 weeks ago)
- Pytorch (Lightning) implementation of the Mamba model (★28, updated 3 weeks ago)
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… (★121, updated 9 months ago)
- Testing KAN-based text generation GPT models (★17, updated last year)
- This repository contains a better implementation of Kolmogorov-Arnold networks (★61, updated last year)
- (★31, updated 10 months ago)
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch (★122, updated last month)
- RAGs: Simple implementations of Retrieval Augmented Generation (RAG) Systems (★104, updated 3 months ago)
- Training small GPT-2 style models using Kolmogorov-Arnold networks. (★117, updated 11 months ago)
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM (★54, updated last year)
- Highly commented implementations of Transformers in PyTorch (★136, updated last year)
- (★134, updated last year)
- Generate graph/data embeddings multiple ways (★52, updated this week)
- (★81, updated last year)
- Repository containing awesome resources regarding Hugging Face tooling. (★47, updated last year)
- Named Entity Recognition with a decoder-only (autoregressive) LLM using HuggingFace (★42, updated 6 months ago)
- This is the code that went into our practical dive using mamba as information extraction (★54, updated last year)
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… (★92, updated last year)
- Pytorch implementation of a simple way to enable (Stochastic) Frame Averaging for any network (★50, updated 9 months ago)
- Complete implementation of Llama2 with/without KV cache & inference (★46, updated 11 months ago)
- (★91, updated last month)
- A set of fundamental operations and deep learning models using JAX (★13, updated 4 years ago)
- Set of scripts to finetune LLMs (★37, updated last year)
- Implementation of Agent Attention in Pytorch (★89, updated 10 months ago)
- Gzip and nearest neighbors for text classification (★57, updated last year)
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts (★118, updated 7 months ago)
- Use Grounding DINO, Segment Anything, and CLIP to label objects in images. (★31, updated last year)
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind (★62, updated 8 months ago)