google-research-datasets / QAmeleon
QAmeleon introduces synthetic multilingual QA data generated with PaLM, a 540B large language model. The dataset was generated by prompt tuning PaLM with only five examples per language. We use the synthetic data to finetune downstream QA models, which improves accuracy over English-only and translation-based baselines.
☆34Updated last year
Alternatives and similar repositories for QAmeleon
Users who are interested in QAmeleon are comparing it to the libraries listed below.
- Embedding Recycling for Language models☆38Updated last year
- A library for squeakily cleaning and filtering language datasets.☆47Updated last year
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping.☆25Updated 2 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆27Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆24Updated last year
- ☆44Updated 7 months ago
- My explorations into editing the knowledge and memories of an attention network☆35Updated 2 years ago
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?"☆33Updated 5 months ago
- Using short models to classify long texts☆21Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆48Updated last year
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…☆14Updated last year
- QLoRA for Masked Language Modeling☆22Updated last year
- Truly flash implementation of the DeBERTa disentangled attention mechanism.☆58Updated last month
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Updated last year
- ☆38Updated last year
- PyTorch implementation for MRL☆18Updated last year
- ☆47Updated 4 months ago
- Experiments with generating open-source language model assistants☆97Updated 2 years ago
- ☆23Updated last year
- Easily run PyTorch on multiple GPUs & machines☆46Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- Code for NeurIPS LLM Efficiency Challenge☆59Updated last year
- ☆14Updated 8 months ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated last year
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"☆28Updated 3 years ago
- Demonstration that finetuning a RoPE model on longer sequences than those used in pre-training adapts the model's context limit☆63Updated 2 years ago
- Minimum Description Length probing for neural network representations☆18Updated 4 months ago
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966).☆59Updated 3 years ago
- ☆20Updated last year
- ☆12Updated 6 months ago