Rojak-NLP / LLM-Code-MixingLinks
Can LLMs generate code-mixed sentences through zero-shot prompting?
☆11Updated 2 years ago
Alternatives and similar repositories for LLM-Code-Mixing
Users that are interested in LLM-Code-Mixing are comparing it to the libraries listed below
Sorting:
- ☆29Updated 3 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"☆30Updated 3 years ago
- A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering☆16Updated 2 years ago
- An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation☆27Updated last year
- IntructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc…☆32Updated last year
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer☆54Updated 2 years ago
- ☆46Updated 3 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆48Updated last year
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Updated 10 months ago
- DEMix Layers for Modular Language Modeling☆53Updated 3 years ago
- Task Compass: Scaling Multi-task Pre-training with Task Prefix (EMNLP 2022: Findings) (stay tuned & more will be updated)☆22Updated 2 years ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆57Updated 2 years ago
- EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560☆58Updated 3 months ago
- TBC☆27Updated 2 years ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings☆16Updated 2 years ago
- Code for Massive-scale Decoding for Text Generation using Lattices☆44Updated 2 years ago
- PyTorch code for "FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization" (NAACL 2022)☆39Updated 2 years ago
- This repository contains the code for paper Prompting ELECTRA Few-Shot Learning with Discriminative Pre-Trained Models.☆48Updated 3 years ago
- Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://a…☆46Updated 2 years ago
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics".☆16Updated 3 years ago
- Data and code for ACL 2023 paper XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations☆10Updated 2 years ago
- ☆15Updated 3 years ago
- Code for COLING22 paper, DPTDR: Deep Prompt Tuning for Dense Passage Retrieval☆25Updated last year
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.☆41Updated last year
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆27Updated 3 years ago
- [ICLR 2023] PyTorch code of Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees☆24Updated 2 years ago
- Code and data associated with the AmbiEnt dataset in "We're Afraid Language Models Aren't Modeling Ambiguity" (Liu et al., 2023)☆63Updated last year
- PyTorch reimplementation of REALM and ORQA☆22Updated 3 years ago
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models'☆17Updated 3 years ago
- ☆38Updated 11 months ago