ngoyal2707 / Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆18 Updated 2 years ago
Alternatives and similar repositories for Megatron-LM:
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- URL downloader supporting checkpointing and continuous checksumming. ☆19 Updated last year
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems." ☆46 Updated 9 months ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆14 Updated last year
- Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021) ☆10 Updated last week
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers` ☆18 Updated 2 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations" ☆13 Updated 3 years ago
- Compute-optimal LLMs ☆11 Updated 2 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 Updated last year
- DeFacto - Demonstrations and Feedback for improving factual consistency of text summarization ☆29 Updated 2 years ago
- GPT-jax based on the official huggingface library ☆13 Updated 3 years ago
- Common crawl pretrained sentencepiece tokenizers for English and Japanese for various vocabulary sizes. Also development environment for … ☆10 Updated 3 years ago
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Updated 2 years ago
- Code for "CyberWallE at SemEval-2020 Task 11: An Analysis of Feature Engineering for Ensemble Models for Propaganda Detection" (V. Blasch… ☆9 Updated 4 years ago
- Efficiently computing & storing token n-grams from large corpora ☆23 Updated 7 months ago
- Robust and Fast tokenizations alignment library for Rust and Python https://tamuhey.github.io/tokenizations/ ☆29 Updated 3 years ago
- Minimum Description Length probing for neural network representations ☆19 Updated 3 months ago
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023) ☆22 Updated last year
- Hugging Face and Pyserini interoperability ☆20 Updated last year
- Training a model without a dataset for natural language inference (NLI) ☆25 Updated 4 years ago
- Using short models to classify long texts ☆21 Updated 2 years ago
- Code for paper "Accessing higher dimensions for unsupervised word translation" ☆21 Updated last year
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection. ☆39 Updated last year
- Code for equipping pretrained language models (BART, GPT-2, XLNet) with commonsense knowledge for generating implicit knowledge statement… ☆16 Updated 3 years ago