erfanzar / OST-OpenSourceTransformers
OST Collection: An AI-powered suite of text-generative models that predict the next word with remarkable accuracy. OST Collection is based on a novel approach to serve as a full and intelligent NLP model.
☆15 · Updated last year
Alternatives and similar repositories for OST-OpenSourceTransformers
Users interested in OST-OpenSourceTransformers are comparing it to the libraries listed below.
- (EasyDel Former) is a utility library designed to simplify and enhance development in JAX ☆28 · Updated last week
- Xerxes, a highly advanced Persian AI assistant developed by InstinctAI, a cutting-edge AI startup. Its primary function is to assist users wi… ☆11 · Updated last year
- Accelerate and optimize performance with streamlined training and serving options in JAX ☆292 · Updated this week
- A cutting-edge text-to-image generator that leverages a state-of-the-art Stable Diffusion model to produce high-quality, realist… ☆13 · Updated last year
- A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/… ☆24 · Updated 4 months ago
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆82 · Updated 3 years ago
- ☆31 · Updated last year
- ☆81 · Updated last year
- RWKV, in easy-to-read code ☆72 · Updated 4 months ago
- Fast, modern, and low-precision PyTorch optimizers ☆99 · Updated last week
- JAX implementation of the Llama 2 model ☆219 · Updated last year
- A set of Python scripts that make your experience on TPUs better ☆55 · Updated last year
- ☆67 · Updated 2 years ago
- Prune transformer layers ☆69 · Updated last year
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)* ☆85 · Updated last year
- Mixture of A Million Experts ☆46 · Updated 11 months ago
- ESM2 protein language models in JAX/Flax ☆17 · Updated 2 years ago
- Google TPU optimizations for transformer models ☆117 · Updated 6 months ago
- Sakura-SOLAR-DPO: Merge, SFT, and DPO ☆116 · Updated last year
- ☆20 · Updated 2 years ago
- ☆36 · Updated last year
- Collection of autoregressive model implementations ☆86 · Updated 3 months ago
- ☆34 · Updated 5 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients ☆198 · Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆59 · Updated 3 years ago
- Modeling code for a BitNet b1.58 Llama-style model ☆25 · Updated last year
- This is a fork of SGLang for hip-attention integration. Please refer to hip-attention for details. ☆15 · Updated this week
- ☆38 · Updated 2 months ago
- ☆53 · Updated 9 months ago
- Inference code for LLaMA models in JAX ☆118 · Updated last year