erfanzar / OST-OpenSourceTransformers
OST Collection: an AI-powered suite of models that predict the next word with remarkable accuracy (text-generative models). OST Collection is based on a novel approach to working as a complete, intelligent NLP model.
☆15Updated 2 years ago
Alternatives and similar repositories for OST-OpenSourceTransformers
Users that are interested in OST-OpenSourceTransformers are comparing it to the libraries listed below
- Xerxes, a highly advanced Persian AI assistant developed by InstinctAI, a cutting-edge AI startup. Its primary function is to assist users wi…☆11Updated last year
- (EasyDel Former) is a utility library designed to simplify and enhance development in JAX☆29Updated last month
- A cutting-edge text-to-image generation model that leverages a state-of-the-art Stable Diffusion model to produce high-quality, realist…☆13Updated last year
- Agents for intelligence and coordination☆21Updated this week
- Accelerate and optimize performance with streamlined training and serving options in JAX.☆328Updated this week
- Implementation of the Llama architecture with RLHF + Q-learning☆168Updated 11 months ago
- A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/…☆33Updated 10 months ago
- RWKV, in easy to read code☆72Updated 9 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"☆38Updated 6 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆102Updated last year
- JAX implementation of the Llama 2 model☆215Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile☆116Updated 2 years ago
- Sakura-SOLAR-DPO: Merge, SFT, and DPO☆116Updated 2 years ago
- Supercharge huggingface transformers with model parallelism.☆77Updated 5 months ago
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024)☆36Updated last year
- ☆95Updated 2 years ago
- Prune transformer layers☆74Updated last year
- Collection of autoregressive model implementation☆85Updated 8 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.☆18Updated 5 months ago
- ☆51Updated last year
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT☆224Updated last year
- ☆137Updated last year
- A byte-level decoder architecture that matches the performance of tokenized Transformers.☆66Updated last year
- ☆82Updated last year
- ☆45Updated 2 years ago
- Implementation of the BitLinear layer from: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (see the sketch after this list)☆13Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆41Updated last year
- A set of Python scripts that makes your experience on TPU better☆55Updated 3 months ago
- Mixture of A Million Experts☆52Updated last year
- ☆16Updated last year
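
The "1.58 Bits" in the BitLinear entry above refers to ternary weights: each weight takes one of three values {-1, 0, +1}, and log2(3) ≈ 1.58 bits of information per weight. The snippet below is a minimal, illustrative JAX sketch of that idea (absmean weight quantization with a straight-through estimator), not the linked repository's actual code; the function names and the `eps` parameter are placeholders.

```python
# Minimal sketch of the BitNet b1.58 "BitLinear" idea (assumed, not the repo's code).
import jax
import jax.numpy as jnp

def absmean_ternary(w, eps=1e-5):
    # Scale by the mean absolute weight, then round and clip to {-1, 0, +1}.
    scale = jnp.mean(jnp.abs(w)) + eps
    return jnp.clip(jnp.round(w / scale), -1.0, 1.0), scale

def bitlinear(x, w, eps=1e-5):
    # Straight-through estimator: the forward pass uses the (rescaled) ternary
    # weights, while gradients flow to the full-precision weights.
    w_q, scale = absmean_ternary(w, eps)
    w_ste = w + jax.lax.stop_gradient(w_q * scale - w)
    return x @ w_ste.T

# Usage example with random data.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (4, 16))   # (batch, in_features)
w = jax.random.normal(key, (32, 16))  # (out_features, in_features)
y = bitlinear(x, w)
print(y.shape)  # (4, 32)
```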