CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training
☆32Jul 20, 2022Updated 3 years ago
Alternatives and similar repositories for CCQA
Users who are interested in CCQA are comparing it to the libraries listed below.
- Source code of CIKM2021 Paper 'Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need'☆16Aug 30, 2021Updated 4 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Mar 6, 2023Updated 3 years ago
- Show the time in Roman Numerals☆11Jan 23, 2020Updated 6 years ago
- This is the official code for the paper 'Systematically Exploring Redundancy Reduction in Summarizing Long Documents'.☆16Apr 30, 2021Updated 4 years ago
- TREC QA dataset for question answering cleaned for usage in Question Answering☆14Aug 26, 2019Updated 6 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…☆35May 24, 2024Updated last year
- Scalable training for dense retrieval models.☆298Jun 10, 2025Updated 9 months ago
- CIKM 2021 Full Paper: FedMatch: Federated Learning Over Heterogeneous Question Answering Data☆12Dec 14, 2021Updated 4 years ago
- Data mapping framework for rust stuff☆49Mar 17, 2026Updated last week
- ☆25Jun 25, 2021Updated 4 years ago
- codebase for the SIMAT dataset and evaluation☆38Feb 16, 2022Updated 4 years ago
- Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference☆44Nov 28, 2022Updated 3 years ago
- GPT as Knowledge Worker (or if you really want, GPT Sorta' Takes the CPA Exam)☆13Jan 24, 2023Updated 3 years ago
- Official library of images for the SIGIR 2019 Open-Source IR Replicability Challenge (OSIRRC 2019)☆13Jul 7, 2019Updated 6 years ago
- ☆13Dec 11, 2021Updated 4 years ago
- ☆15Aug 15, 2012Updated 13 years ago
- Synthetic Data Generation for Evaluation☆13Feb 21, 2025Updated last year
- SQL parser and converter☆11Jan 5, 2024Updated 2 years ago
- Anserini notebooks☆69Apr 2, 2023Updated 2 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆74Mar 2, 2024Updated 2 years ago
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing☆14Feb 10, 2023Updated 3 years ago
- This is a prototype of a Python module for simple modification of document files.☆18Jan 8, 2022Updated 4 years ago
- Generalised UDRL☆37May 12, 2022Updated 3 years ago
- SIGIR'21: Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track.☆128Feb 15, 2022Updated 4 years ago
- A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses and Loggers to better integrate pytorch-lightning with transfor…☆47May 29, 2023Updated 2 years ago
- ☆18Jun 10, 2022Updated 3 years ago
- Meta-Analysis of Robust04 Papers (Yang et al., SIGIR 2019)☆12May 25, 2019Updated 6 years ago
- Notes compiled from a Python course on work automation☆13Oct 10, 2017Updated 8 years ago
- An implementation of GrASP (Shnarch et al., 2017)☆23Aug 29, 2022Updated 3 years ago
- Fast-Slow Recurrent Neural Networks☆14Jan 31, 2018Updated 8 years ago
- A library for Partially Homomorphic Encryption in Python☆12May 30, 2017Updated 8 years ago
- ☆44Mar 29, 2023Updated 2 years ago
- A Python Interface to Reproducibility Measures of System-Oriented IR Experiments☆11Dec 2, 2025Updated 3 months ago
- Rank-Biased Precision, Overlap, Recall, and Alignment☆12Feb 18, 2025Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval☆29Sep 26, 2022Updated 3 years ago
- ☆91May 21, 2022Updated 3 years ago
- A library for creating complex experimental pipelines☆12Jul 25, 2022Updated 3 years ago
- Metadata browser of TREC☆10Mar 9, 2026Updated 2 weeks ago
- High-dimensional approximate nearest neighbor in python☆11Sep 18, 2018Updated 7 years ago