DOUDOU0314 / GPT-J-hf
GPT-jax based on the official huggingface library
☆13Updated 4 years ago
Alternatives and similar repositories for GPT-J-hf
Users that are interested in GPT-J-hf are comparing it to the libraries listed below
- ☆32Updated 2 years ago
- TorchServe+Streamlit for easily serving your HuggingFace NER models☆33Updated 2 years ago
- Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE☆18Updated 3 years ago
- Using short models to classify long texts☆21Updated 2 years ago
- URL downloader supporting checkpointing and continuous checksumming.☆19Updated last year
- Implementation of autoregressive language model using improved Transformer and DeepSpeed pipeline parallelism.☆32Updated 3 years ago
- ☆11Updated 4 years ago
- ☆37Updated 2 years ago
- ☆12Updated 6 months ago
- A library for squeakily cleaning and filtering language datasets.☆47Updated last year
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP☆58Updated 2 years ago
- NLG Best Practices for Data-Efficient Modeling: How to Train Production-Ready Models with Little Data☆10Updated 3 years ago
- ☆15Updated 3 years ago
- Ranking of fine-tuned HF models as base models.☆35Updated last month
- hllama is a library which aims to provide a set of utility tools for large language models.☆10Updated last year
- Implementation of stop sequencer for Huggingface Transformers☆16Updated 2 years ago
- ☆28Updated 2 years ago
- Hidden Engrams: Long Term Memory for Transformer Model Inference☆35Updated 3 years ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.☆36Updated 2 years ago
- The collection of building blocks for building fine-tunable metric learning models☆32Updated 2 months ago
- Megatron LM 11B on Huggingface Transformers☆27Updated 3 years ago
- ↔️ T5 Machine Translation from English to Korean☆18Updated 2 years ago
- Calculating Expected Time for training LLM.☆38Updated 2 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing☆50Updated 3 years ago
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo…☆39Updated last year
- Describe the format of image/text datasets☆11Updated 3 years ago
- **ARCHIVED** Filesystem interface to 🤗 Hub☆58Updated 2 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…☆34Updated last year
- Transformer-based Trigram Blocking implementation in TensorFlow☆11Updated 5 years ago
- ☆11Updated 2 years ago