microsoft / Optimal-Freshness-Crawl-Scheduling
Dataset and code for three Web crawling-related papers from SIGIR-2019, NeurIPS-2019, and ICML-2020.
☆40Updated 11 months ago
Alternatives and similar repositories for Optimal-Freshness-Crawl-Scheduling
Users that are interested in Optimal-Freshness-Crawl-Scheduling are comparing it to the libraries listed below
- Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Sear…☆86Updated 4 years ago
- An Extensible Conversational Information Seeking Platform☆156Updated last year
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.☆86Updated 4 years ago
- Truly Conversational Search is the next logical step in the journey to generate intelligent and useful AI. To understand what this may mean…☆113Updated 2 years ago
- A multi-stage neural search engine for the COVID-19 Open Research Dataset☆138Updated 2 years ago
- Datasets I have created for scientific summarization, and a trained BertSum model☆116Updated 6 years ago
- A collection of simple tutorials for using Fonduer☆100Updated 5 years ago
- Automatically labeling training data☆107Updated 6 years ago
- ☆99Updated 5 years ago
- ✨ Web interface for NeuralCoref coreference resolution☆34Updated 2 years ago
- Keras Implementation of Flair's Contextualized Embeddings☆26Updated 4 years ago
- Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results☆31Updated 6 years ago
- A web application for tagging and retrieval of arguments in text☆29Updated 2 years ago
- Relatively simple text classification powered by spaCy☆41Updated 10 years ago
- A tool for evaluation of semantic similarity measures.☆22Updated 12 years ago
- A DeepWalk implementation for ontologies using NetworkX and Gensim☆19Updated 8 years ago
- Automatically extracting keyphrases that are salient to the document meanings is an essential step to semantic document understanding. An…☆158Updated 2 years ago
- The WebSplit Benchmark introducing "Split and Rephrase" task☆63Updated 7 years ago
- Keras implementation of ontology aware token embeddings☆49Updated 7 years ago
- Performance evaluation of nearest neighbor search using Vespa, Elasticsearch and Open Distro for Elasticsearch K-NN☆117Updated 4 years ago
- Misspelling Oblivious Word Embeddings☆201Updated 6 years ago
- Implementation of GloVe in Keras☆45Updated 2 years ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities☆118Updated 5 months ago
- Machine Learning for Information Retrieval☆86Updated 7 months ago
- OpenNeuroSpell contains parts of NeuroSpell (http://neurospell.com/en.php) released as open-source. More code will be published as soon a…☆20Updated last year
- ALMa (Active Learning Manager) Keeps track of labeled and unlabeled data for active learning☆42Updated 5 years ago
- How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.☆135Updated 3 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html☆140Updated 3 years ago
- A PyTorch implementation of Google AI's BERT model provided with Google's pre-trained models, examples and utilities.☆30Updated 6 years ago
- KnowledgeNet: A Benchmark Dataset for Knowledge Base Population☆270Updated 4 years ago