QAmeleon introduces synthetic multilingual QA data generated with PaLM, a 540B-parameter large language model. The dataset was produced by prompt tuning PaLM with only five examples per language. We use the synthetic data to finetune downstream QA models, which improves accuracy over English-only and translation-based baselines.
☆34Aug 15, 2023Updated 2 years ago
Alternatives and similar repositories for QAmeleon
Users who are interested in QAmeleon are comparing it to the libraries listed below.
- 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment☆11Apr 6, 2025Updated last year
- Can LLMs generate code-mixed sentences through zero-shot prompting?☆11Apr 18, 2023Updated 2 years ago
- A collection of utilities for handling IPA phones.☆27Sep 24, 2023Updated 2 years ago
- Ukrainian ELECTRA model☆12Mar 11, 2023Updated 3 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning☆30Jan 25, 2023Updated 3 years ago
- 🚢 Data Toolkit for Sailor Language Models☆96Feb 24, 2025Updated last year
- ☆54Mar 31, 2026Updated 2 weeks ago
- The paper list of multilingual pre-trained models (Continual Updated).☆24Jun 18, 2024Updated last year
- A library for language transfer methods and algorithms.☆16Feb 6, 2026Updated 2 months ago
- suffix array construction and searching algorithms for in-memory binary data.☆12Sep 10, 2022Updated 3 years ago
- From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks☆15Feb 23, 2023Updated 3 years ago
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs☆36Mar 23, 2026Updated 3 weeks ago
- Python module to remove wiki markup text.☆10Jan 15, 2016Updated 10 years ago
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q…☆89Feb 27, 2024Updated 2 years ago
- ☆21Nov 20, 2020Updated 5 years ago
- ☆10Oct 17, 2021Updated 4 years ago
- ☆58Nov 5, 2024Updated last year
- A part-of-speech tagger with support for domain adaptation and external resources.☆24Oct 26, 2022Updated 3 years ago
- ☆12Apr 1, 2026Updated 2 weeks ago
- Submission archive for the MS MARCO passage ranking leaderboard☆13Apr 21, 2023Updated 2 years ago
- Manifests list for a multi-arch Docker image☆11Jan 23, 2019Updated 7 years ago
- Non-Metric Space (Approximate) Library in R☆12Feb 2, 2023Updated 3 years ago
- Deep memory and sequence models in JAX☆25Jan 15, 2026Updated 3 months ago
- Official repository of the paper MPMQA: Multimodal Question Answering on Product Manuals (AAAI 2023)☆19Nov 28, 2022Updated 3 years ago
- Centrality-based Chinese keyphrase extraction tool☆11Sep 2, 2022Updated 3 years ago
- EWoK dataset generation framework☆11May 14, 2024Updated last year
- ☆12Jul 6, 2023Updated 2 years ago
- ☆12Dec 13, 2022Updated 3 years ago
- ☆24Oct 23, 2020Updated 5 years ago
- ☆11Jun 19, 2022Updated 3 years ago
- https://arxiv.org/abs/2404.10917☆14Mar 18, 2025Updated last year
- LEMON: Explainable Entity Matching☆19Apr 6, 2022Updated 4 years ago
- Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies regard classification and bias mitigation triggers.☆16Sep 25, 2024Updated last year
- Few-shot Learning with Auxiliary Data☆31Dec 8, 2023Updated 2 years ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆82Apr 11, 2024Updated 2 years ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients☆27Sep 10, 2024Updated last year
- Python package to augment multilingual data☆15Feb 15, 2023Updated 3 years ago
- [COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models☆18Jan 18, 2025Updated last year
- ✂️ Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models☆38Oct 1, 2025Updated 6 months ago