ENOT-AutoDL / gpt-j-6B-tensorrt-int8
GPT-J 6B inference on TensorRT with INT-8 precision
☆11 Updated 2 years ago
Alternatives and similar repositories for gpt-j-6B-tensorrt-int8
Users who are interested in gpt-j-6B-tensorrt-int8 are comparing it to the libraries listed below.
- My explorations into editing the knowledge and memories of an attention network ☆35 Updated 2 years ago
- Checkpointable dataset utilities for foundation model training ☆32 Updated last year
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆27 Updated last year
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆57 Updated last year
- sigma-MoE layer ☆18 Updated last year
- Code for the paper-"Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966). ☆58 Updated 3 years ago
- Adversarial Training and SFT for Bot Safety Models ☆40 Updated 2 years ago
- Zeta implementation of a reusable and plug in and play feedforward from the paper "Exponentially Faster Language Modeling" ☆16 Updated 6 months ago
- ☆20 Updated last year
- Index of URLs to pdf files all over the internet and scripts ☆23 Updated 2 years ago
- This repository contains example code to build models on TPUs ☆30 Updated 2 years ago
- Faster Pytorch bitsandbytes 4bit fp4 nn.Linear ops ☆28 Updated last year
- Observe the slow deterioration of my mental sanity in the github commit history ☆12 Updated 2 years ago
- Memory-efficient transformer. Work in progress. ☆19 Updated 2 years ago
- ☆13 Updated 6 years ago
- Running massive simulations using RNNs on CPUs for building bots and all kinds of things. ☆14 Updated 3 years ago
- High performance pytorch modules ☆18 Updated 2 years ago
- ☆39 Updated 2 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆116 Updated 3 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆50 Updated 3 years ago
- Implements the SM3-II adaptive optimization algorithm for PyTorch. ☆33 Updated 9 months ago
- Exploring finetuning public checkpoints on filter 8K sequences on Pile ☆115 Updated 2 years ago
- GPT-jax based on the official huggingface library ☆13 Updated 3 years ago
- Various transformers for FSDP research ☆37 Updated 2 years ago
- [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion ☆40 Updated 4 years ago
- TorchServe+Streamlit for easily serving your HuggingFace NER models ☆33 Updated 2 years ago
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch ☆74 Updated 2 years ago
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E… ☆27 Updated 2 years ago
- ☆26 Updated 2 years ago