QAmeleon introduces synthetic multilingual QA data generated with PaLM, a 540B large language model. The dataset was generated by prompt tuning PaLM with only five examples per language. We use the synthetic data to finetune downstream QA models, which improves accuracy over English-only and translation-based baselines.
☆34Aug 15, 2023Updated 2 years ago
Alternatives and similar repositories for QAmeleon
Users that are interested in QAmeleon are comparing it to the libraries listed below.
- Gzip and nearest neighbors for text classification☆57Aug 1, 2023Updated 2 years ago
- Can LLMs generate code-mixed sentences through zero-shot prompting?☆11Apr 18, 2023Updated 3 years ago
- A collection of utilities for handling IPA phones.☆27Sep 24, 2023Updated 2 years ago
- Ukrainian ELECTRA model☆12Mar 11, 2023Updated 3 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning☆30Jan 25, 2023Updated 3 years ago
- 🚢 Data Toolkit for Sailor Language Models☆95Feb 24, 2025Updated last year
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- The paper list of multilingual pre-trained models (Continually Updated).☆24Jun 18, 2024Updated last year
- A library for language transfer methods and algorithms.☆16Feb 6, 2026Updated 4 months ago
- Library of models for Protein Function prediction (part of the 18th top solution out of 1625 teams in CAFA5)☆20May 23, 2025Updated last year
- suffix array construction and searching algorithms for in-memory binary data.☆12Sep 10, 2022Updated 3 years ago
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs☆36Jun 5, 2026Updated last week
- A part-of-speech tagger with support for domain adaptation and external resources.☆24Oct 26, 2022Updated 3 years ago
- Submission archive for the MS MARCO passage ranking leaderboard☆13Apr 21, 2023Updated 3 years ago
- Deep memory and sequence models in JAX☆30Jun 8, 2026Updated last week
- Non-Metric Space (Approximate) Library in R☆12Feb 2, 2023Updated 3 years ago
- Official repository of the paper MPMQA: Multimodal Question Answering on Product Manuals (AAAI 2023)☆21Nov 28, 2022Updated 3 years ago
- Library for experimenting with state-of-the-art evaluation metrics like UScore☆12May 27, 2023Updated 3 years ago
- EWoK dataset generation framework☆14May 14, 2024Updated 2 years ago
- https://arxiv.org/abs/2404.10917☆14Mar 18, 2025Updated last year
- Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies regard classification and bias mitigation triggers.☆16Sep 25, 2024Updated last year
- Few-shot Learning with Auxiliary Data☆31Dec 8, 2023Updated 2 years ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆81Apr 11, 2024Updated 2 years ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients☆27Sep 10, 2024Updated last year
- Python package to augment multilingual data☆15Feb 15, 2023Updated 3 years ago
- [COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models☆18Jan 18, 2025Updated last year
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data.☆22Mar 23, 2026Updated 2 months ago
- A powerful text cleaner for Japanese web texts☆12Jan 20, 2024Updated 2 years ago
- SIGIR 2023 tutorial on cross language information retrieval.☆13Feb 28, 2024Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Nov 13, 2023Updated 2 years ago
- A framework to train language models to learn invariant representations.☆14Jan 24, 2022Updated 4 years ago
- Scaling Sparse Fine-Tuning to Large Language Models☆19Jan 31, 2024Updated 2 years ago
- Rust binding to crfsuite☆25Jan 31, 2026Updated 4 months ago