CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training
☆32 · Jul 20, 2022 · Updated 3 years ago
Alternatives and similar repositories for CCQA
Users who are interested in CCQA are comparing it to the repositories listed below
- Source code of CIKM2021 Paper 'Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need' ☆16 · Aug 30, 2021 · Updated 4 years ago
- Show the time in Roman Numerals ☆11 · Jan 23, 2020 · Updated 6 years ago
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing ☆14 · Feb 10, 2023 · Updated 3 years ago
- CIKM 2021 Full Paper: FedMatch: Federated Learning Over Heterogeneous Question Answering Data ☆12 · Dec 14, 2021 · Updated 4 years ago
- ☆18 · Jun 10, 2022 · Updated 3 years ago
- TREC QA dataset for question answering cleaned for usage in Question Answering ☆14 · Aug 26, 2019 · Updated 6 years ago
- ☆13 · Dec 11, 2021 · Updated 4 years ago
- Synthetic Data Generation for Evaluation ☆13 · Feb 21, 2025 · Updated last year
- codebase for the SIMAT dataset and evaluation ☆38 · Feb 16, 2022 · Updated 4 years ago
- ☆14 · May 31, 2022 · Updated 3 years ago
- Generalised UDRL ☆37 · May 12, 2022 · Updated 3 years ago
- Scalable training for dense retrieval models. ☆298 · Jun 10, 2025 · Updated 8 months ago
- Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference ☆45 · Nov 28, 2022 · Updated 3 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆74 · Mar 2, 2024 · Updated 2 years ago
- ☆46 · Apr 13, 2022 · Updated 3 years ago
- A repository for experiments in quality-aware decoding ☆18 · Jun 7, 2022 · Updated 3 years ago
- Paranoid Transformer for NaNoGenMo ☆19 · Nov 1, 2020 · Updated 5 years ago
- Code for the Ask4Help project ☆22 · Nov 24, 2022 · Updated 3 years ago
- Measuring if attention is explanation with ROAR ☆22 · Mar 3, 2023 · Updated 3 years ago
- ☆19 · Jul 6, 2023 · Updated 2 years ago
- This is the official repo for Gradient Agreement Filtering (GAF). ☆24 · Jan 27, 2025 · Updated last year
- Implementation of pQRNN in PyTorch ☆46 · Oct 10, 2021 · Updated 4 years ago
- ☆25 · Jun 25, 2021 · Updated 4 years ago
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966). ☆61 · Feb 7, 2022 · Updated 4 years ago
- A six-dimensional evaluation framework for drama script continuation with interactive leaderboard and case studies ☆82 · Jan 1, 2026 · Updated 2 months ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆61 · Mar 1, 2022 · Updated 4 years ago
- A Collection of Pydantic Models to Abstract IRL ☆37 · Dec 10, 2025 · Updated 2 months ago
- This is a repository for my work on the paper "Oracle Guided Image Synthesis with Relative Queries". ☆24 · May 6, 2022 · Updated 3 years ago
- ☆24 · Sep 2, 2024 · Updated last year
- A Benchmark for Efficient and Compositional Visual Reasoning ☆25 · Aug 2, 2023 · Updated 2 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆27 · Nov 30, 2024 · Updated last year
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆157 · Dec 20, 2023 · Updated 2 years ago
- ☆54 · Jan 18, 2023 · Updated 3 years ago
- Generate visual podcasts about novels using open source models ☆26 · Feb 15, 2023 · Updated 3 years ago
- The Codebase for Causal Distillation for Language Models (NAACL '22) ☆26 · May 1, 2022 · Updated 3 years ago
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 · Feb 1, 2023 · Updated 3 years ago
- [NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks ☆60 · Nov 24, 2022 · Updated 3 years ago
- One stop shop for all things carp ☆59 · Sep 9, 2022 · Updated 3 years ago
- Self-training with Weak Supervision (NAACL 2021) ☆163 · Jul 24, 2023 · Updated 2 years ago