☆17Dec 11, 2024Updated last year
Alternatives and similar repositories for ClueWeb22
Users that are interested in ClueWeb22 are comparing it to the libraries listed below.
- ☆33May 23, 2023Updated 2 years ago
- TREC-COVID results - this is a mirror of data on the TREC website in a more convenient format.☆15Aug 31, 2020Updated 5 years ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆31Jul 9, 2024Updated last year
- ☆16Mar 25, 2022Updated 4 years ago
- ☆17Jul 18, 2022Updated 3 years ago
- [ACL 2024 Oral] This is the code repo for our ACL’24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Mo…☆39Jun 30, 2024Updated last year
- [CIKM 2023 Oral] This is the code repo for our CIKM’23 paper "Text Matching Improves Sequential Recommendation by Reducing Popularity Bia…☆40Mar 17, 2024Updated 2 years ago
- Generative Reranker PyTerrier☆18Dec 1, 2025Updated 4 months ago
- A robust web archive analytics toolkit☆135Apr 2, 2026Updated last week
- Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"☆42Dec 9, 2021Updated 4 years ago
- Japanese synonym library☆11Apr 18, 2022Updated 3 years ago
- ☆24Oct 23, 2020Updated 5 years ago
- Zunda: Japanese Enhanced Modality Analyzer client for Python.☆10Nov 30, 2019Updated 6 years ago
- [EMNLP 2022] This is the code repo for our EMNLP’22 paper "Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder"…☆13Oct 20, 2022Updated 3 years ago
- A GPT-powered AI auto scraper for websites. AI Web Scraping made easy.☆14Jun 26, 2023Updated 2 years ago
- A large-scale information-rich web dataset, featuring millions of real clicked query-document labels☆346Dec 16, 2024Updated last year
- "FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning" (ACL 2023)☆15Jul 24, 2023Updated 2 years ago
- Based on the paper "Chain-of-Retrieval Augmented Generation"☆14Mar 29, 2025Updated last year
- ☆13Feb 5, 2022Updated 4 years ago
- Toolkit for domain-specific information retrieval experimentation☆19Feb 24, 2026Updated last month
- Resources for the Tutorial on "Utilizing Knowledge Bases in Text-centric Information Retrieval"☆25Sep 18, 2016Updated 9 years ago
- ☆12May 17, 2022Updated 3 years ago
- ☆12Jul 13, 2023Updated 2 years ago
- A library for open domain query facet extraction and generation☆16Apr 24, 2024Updated last year
- ☆39Jul 25, 2024Updated last year
- ☆15Jun 9, 2018Updated 7 years ago
- ☆13Dec 21, 2021Updated 4 years ago
- ☆10Jan 12, 2018Updated 8 years ago
- AllenNLP integration for Shiba: Japanese CANINE model☆12Jun 26, 2021Updated 4 years ago
- [EMNLP 2025 Findings] Familiarity-aware Evidence Compression for Retrieval Augmented Generation☆15Aug 20, 2025Updated 7 months ago
- Evaluation tools shared across anserini, pyserini, and pygaggle☆35Mar 19, 2026Updated 3 weeks ago
- WebConf 2020 paper Leading Conversational Search by Suggesting Useful Questions☆33May 4, 2020Updated 5 years ago
- Code for COLING22 paper, DPTDR: Deep Prompt Tuning for Dense Passage Retrieval☆26Aug 7, 2023Updated 2 years ago
- Facebook link prediction Kaggle challenge.☆15Aug 10, 2014Updated 11 years ago
- Trials of pre-trained BERT models for the medical domain in Japanese.☆12Nov 21, 2020Updated 5 years ago
- ☆15Oct 10, 2021Updated 4 years ago
- [ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".☆229Aug 28, 2024Updated last year
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.☆19Jun 27, 2024Updated last year
- docker for UTH-BERT: https://ai-health.m.u-tokyo.ac.jp/uth-bert☆14Mar 24, 2023Updated 3 years ago