xrr233 / Webformer
SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval
☆47 · Updated 2 years ago
Alternatives and similar repositories for Webformer
Users who are interested in Webformer are comparing it to the repositories listed below.
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆38 · Updated 7 months ago
- An Open-Source Package for Information Retrieval ☆162 · Updated 2 months ago
- Implementation of paper: HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking ☆67 · Updated 2 years ago
- Build Text Rerankers with Deep Language Models ☆262 · Updated last year
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆140 · Updated 4 months ago
- ☆86 · Updated last month
- The homepage for ConvSearch Dataset. ☆14 · Updated 2 years ago
- Codebase for RetroMAE and beyond. ☆259 · Updated 11 months ago
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)" ☆101 · Updated 2 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆176 · Updated 2 years ago
- MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, … ☆124 · Updated 3 years ago
- An easy-to-use Python toolkit for flexibly adapting various neural ranking models to a target domain. ☆59 · Updated 2 years ago
- [EMNLP 2021] The baseline code for WebSRC dataset. ☆50 · Updated last month
- ☆17 · Updated 3 years ago
- Code repo for ACL22 paper "DeepStruct: Pretraining of Language Models for Structure Prediction" ☆84 · Updated 2 years ago
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆162 · Updated last year
- ☆7 · Updated 2 years ago
- Unofficial PyTorch implementation of the DOM-LM paper. ☆33 · Updated 2 years ago
- A multilingual version of MS MARCO passage ranking dataset ☆145 · Updated last year
- The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. … ☆140 · Updated 2 years ago
- 🌳CED: Catalog Extraction from Documents ☆16 · Updated last year
- ReadingBank: A Benchmark Dataset for Reading Order Detection ☆105 · Updated 8 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆157 · Updated last year
- Scalable training for dense retrieval models. ☆292 · Updated 2 months ago
- EMNLP 2024 Findings "Schema-Driven Information Extraction from Heterogeneous Tables" ☆24 · Updated 5 months ago
- Leveraging large language models for text-to-SQL synthesis, this project fine-tunes WizardLM/WizardCoder-15B-V1.0 with QLoRA on a custom … ☆44 · Updated last year
- [ACL 2022] A hierarchical table dataset for question answering and data-to-text generation. ☆85 · Updated last month
- ☆16 · Updated 4 years ago
- 🚢 Data Toolkit for Sailor Language Models ☆90 · Updated 2 months ago
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd… ☆58 · Updated 5 months ago