erfanzar / OST-OpenSourceTransformers
OST Collection: An AI-powered suite of text-generative models that predict the next word with remarkable accuracy. OST Collection is based on a novel approach to serve as a full and intelligent NLP model.
☆15 · Updated last year
Alternatives and similar repositories for OST-OpenSourceTransformers
Users interested in OST-OpenSourceTransformers are comparing it to the libraries listed below.
- (EasyDel Former) is a utility library designed to simplify and enhance development in JAX ☆28 · Updated last week
- Xerxes, a highly advanced Persian AI assistant developed by InstinctAI, a cutting-edge AI startup. Its primary function is to assist users wi… ☆11 · Updated last year
- Accelerate and optimize performance with streamlined training and serving options in JAX ☆292 · Updated this week
- A cutting-edge text-to-image generator that leverages a state-of-the-art Stable Diffusion model to produce high-quality, realist… ☆13 · Updated last year
- A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/… ☆24 · Updated 4 months ago
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆82 · Updated 3 years ago
- ☆31 · Updated last year
- ☆81 · Updated last year
- RWKV, in easy-to-read code ☆72 · Updated 4 months ago
- Fast, modern, and low-precision PyTorch optimizers ☆99 · Updated last week
- JAX implementation of the Llama 2 model ☆219 · Updated last year
- A set of Python scripts that make your experience on TPUs better ☆55 · Updated last year
- ☆67 · Updated 2 years ago
- Prune transformer layers ☆69 · Updated last year
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)* ☆85 · Updated last year
- Mixture of A Million Experts ☆46 · Updated 11 months ago
- ESM2 protein language models in JAX/Flax ☆17 · Updated 2 years ago
- Google TPU optimizations for transformer models ☆117 · Updated 6 months ago
- Sakura-SOLAR-DPO: Merge, SFT, and DPO ☆116 · Updated last year
- ☆20 · Updated 2 years ago
- ☆36 · Updated last year
- Collection of autoregressive model implementations ☆86 · Updated 3 months ago
- ☆34 · Updated 5 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients ☆198 · Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆59 · Updated 3 years ago
- Modeling code for a BitNet b1.58 Llama-style model ☆25 · Updated last year
- This is a fork of SGLang for hip-attention integration. Please refer to hip-attention for details. ☆15 · Updated this week
- ☆38 · Updated 2 months ago
- ☆53 · Updated 9 months ago
- Inference code for LLaMA models in JAX ☆118 · Updated last year