Dataset and code for three Web crawling-related papers from SIGIR-2019, NeurIPS-2019, and ICML-2020.
☆41Jan 7, 2025Updated last year
Alternatives and similar repositories for Optimal-Freshness-Crawl-Scheduling
Users that are interested in Optimal-Freshness-Crawl-Scheduling are comparing it to the libraries listed below
- ☆34Feb 17, 2021Updated 5 years ago
- website for MS Marco☆34Mar 26, 2025Updated 11 months ago
- R library for common information retrieval metrics☆14Jun 5, 2023Updated 2 years ago
- Experimental search engine in C/C++17 - still in early development.☆27Sep 5, 2025Updated 5 months ago
- The inverted index exchange format as defined as part of the Open-Source IR Replicability Challenge (OSIRRC) initiative☆11Aug 6, 2025Updated 6 months ago
- Truly Conversational Search is the next logic step in the journey to generate intelligent and useful AI. To understand what this may mean…☆114Jun 12, 2023Updated 2 years ago
- Website for the TREC Deep Learning Track 2019☆86Jun 12, 2023Updated 2 years ago
- Tools relating to the CC-News-En Collection☆20Dec 8, 2023Updated 2 years ago
- Tool for comparing two ranked lists (TREC run files)☆20Nov 9, 2022Updated 3 years ago
- Minimalistic BM25 search engine in C/C++, Java, and nearly 20 other languages☆22Jun 19, 2024Updated last year
- A Test Collection of Computer Science Papers for Faceted Query by Example☆22Nov 28, 2021Updated 4 years ago
- Source code for: On the Effect of Low-Frequency Terms on Neural-IR Models, SIGIR'19☆48Apr 30, 2019Updated 6 years ago
- Zig bindings for the excellent CRoaring library☆38Oct 27, 2025Updated 4 months ago
- Resources for the Tutorial on "Utilizing Knowledge Bases in Text-centric Information Retrieval"☆25Sep 18, 2016Updated 9 years ago
- Generating Questions and Distractors automatically from Multimedia. Undergraduate Thesis work.☆22Feb 7, 2016Updated 10 years ago
- MS MARCO (Microsoft Machine Reading Comprehension) is a large scale dataset focused on machine reading comprehension and question answerin…☆226Jun 12, 2023Updated 2 years ago
- Automatically extracting keyphrases that are salient to the document meanings is an essential step to semantic document understanding. An…☆159Jun 12, 2023Updated 2 years ago
- source code of bison☆26Jul 20, 2020Updated 5 years ago
- Code for CEDR: Contextualized Embeddings for Document Ranking, accepted at SIGIR 2019.☆156Nov 6, 2020Updated 5 years ago
- Multi-stage passage ranking: monoBERT + duoBERT☆110Nov 23, 2020Updated 5 years ago
- Evaluation tools shared across anserini, pyserini, and pygaggle☆35Updated this week
- Code and data for Teddy https://arxiv.org/abs/2001.05171.☆15Jun 21, 2022Updated 3 years ago
- Common Index File Format to support interoperability between open-source IR engines☆40Sep 19, 2024Updated last year
- Submission archive for the MS MARCO document ranking leaderboard☆31Oct 9, 2023Updated 2 years ago
- WebConf 2020 paper Leading Conversational Search by Suggesting Useful Questions☆33May 4, 2020Updated 5 years ago
- Indri search implementation on top of Lucene search engine☆35Mar 12, 2024Updated last year
- Vector Space Model Framework developed for InPhO☆39May 9, 2025Updated 9 months ago
- ☆89Apr 3, 2025Updated 11 months ago
- mReasoner is a unified computational implementation of the model theory of thinking and reasoning☆13Aug 17, 2023Updated 2 years ago
- ☆39Nov 21, 2022Updated 3 years ago
- Search COVID-19 Open Research Dataset (CORD-19) using Vespa - the open source big data serving engine.☆38Nov 11, 2025Updated 3 months ago
- ☆86Sep 13, 2023Updated 2 years ago
- My configures and setup when installing a new machine.☆11Jul 30, 2023Updated 2 years ago
- ☆14May 14, 2019Updated 6 years ago
- Free programming language books☆10Jun 4, 2020Updated 5 years ago
- ☆10Jul 6, 2023Updated 2 years ago
- Security research organization dedicated to finding low hanging, critical, vulnerabilities.☆15May 12, 2022Updated 3 years ago
- Wikimedia Enterprise - client SDK in Python☆20Nov 11, 2025Updated 3 months ago
- Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory.☆11Updated this week