xrr233 / Webformer
SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval
☆47 Updated 2 years ago
Related projects:
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆36 Updated last year
- Unofficial PyTorch implementation of Dom-LM paper. ☆30 Updated last year
- Implementation of paper: HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking ☆65 Updated last year
- Build Text Rerankers with Deep Language Models ☆245 Updated 7 months ago
- An Open-Source Package for Information Retrieval ☆145 Updated last month
- An easy-to-use Python toolkit for flexibly adapting various neural ranking models to any target domain. ☆55 Updated last year
- MS MARCO (Microsoft Machine Reading Comprehension) is a large scale dataset focused on machine reading comprehension, question answering, … ☆118 Updated 2 years ago
- A multilingual version of MS MARCO passage ranking dataset ☆142 Updated 11 months ago
- Codebase for RetroMAE and beyond. ☆227 Updated 3 months ago
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)" ☆91 Updated last year
- TAT-QA (Tabular And Textual dataset for Question Answering) contains 16,552 questions associated with 2,757 hybrid contexts from real-wor… ☆88 Updated last week
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆120 Updated 8 months ago
- Code repo for ACL22 paper "DeepStruct: Pretraining of Language Models for Structure Prediction" ☆80 Updated last year
- Zero-shot Document Ranking with Large Language Models. ☆88 Updated 2 months ago
- ☆82 Updated 3 weeks ago
- TUTA and ForTaP for Structure-Aware and Numerical-Reasoning-Aware Table Pre-Training ☆96 Updated last year
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd… ☆39 Updated 2 months ago
- [NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables". ☆104 Updated 4 months ago
- A toolkit for building dense retrievers with deep language models. ☆52 Updated 2 years ago
- KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction… ☆96 Updated 2 months ago
- Winner system (DAMO-NLP) of SemEval 2022 MultiCoNER shared task over 10 out of 13 tracks. ☆176 Updated last year
- Leveraging passage embeddings for efficient listwise reranking with large language models. ☆27 Updated 2 months ago
- ☆47 Updated last month
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆104 Updated 6 months ago
- This repository is a collection of legal instruction datasets ☆12 Updated 2 months ago
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆98 Updated 2 weeks ago
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks ☆47 Updated 5 months ago
- ☆17 Updated 3 years ago
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP ☆19 Updated 8 months ago
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch ☆74 Updated 2 months ago