DOUDOU0314 / GPT-J-hf
GPT-jax based on the official huggingface library
☆13 Updated 3 years ago
Related projects
Alternatives and complementary repositories for GPT-J-hf
- NLG Best Practices for Data-Efficient Modeling: How to Train Production-Ready Models with Little Data ☆11 Updated 3 years ago
- Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE ☆18 Updated 3 years ago
- URL downloader supporting checkpointing and continuous checksumming. ☆19 Updated 11 months ago
- My explorations into editing the knowledge and memories of an attention network ☆34 Updated last year
- ☆11 Updated 4 years ago
- Ranking of fine-tuned HF models as base models. ☆35 Updated last year
- TorchServe+Streamlit for easily serving your HuggingFace NER models ☆31 Updated 2 years ago
- ☆23 Updated last year
- Using short models to classify long texts ☆20 Updated last year
- PyTorch implementation of GLOM ☆21 Updated 2 years ago
- ☆17 Updated last year
- exBERT on Transformers🤗 ☆10 Updated 3 years ago
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo… ☆38 Updated 9 months ago
- Few Shot Learning using EleutherAI's GPT-Neo, an open-source version of GPT-3 ☆18 Updated 3 years ago
- ↔️ T5 Machine Translation from English to Korean ☆17 Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets. ☆45 Updated last year
- Convenient Text-to-Text Training for Transformers ☆19 Updated 2 years ago
- Implementation of stop sequencer for Huggingface Transformers ☆15 Updated last year
- Megatron LM 11B on Huggingface Transformers ☆27 Updated 3 years ago
- A collection of building blocks for building fine-tunable metric learning models ☆32 Updated last month
- Code for running the experiments in Deep Subjecthood: Higher Order Grammatical Features in Multilingual BERT ☆16 Updated last year
- ☆9 Updated 3 months ago
- ☆16 Updated 2 years ago
- ☆32 Updated last year
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆47 Updated 2 years ago
- Training a model without a dataset for natural language inference (NLI) ☆25 Updated 4 years ago
- Hidden Engrams: Long Term Memory for Transformer Model Inference ☆34 Updated 3 years ago