xrr233 / Webformer
SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval
☆47 Updated 2 years ago
Alternatives and similar repositories for Webformer
Users that are interested in Webformer are comparing it to the libraries listed below
- Implementation of paper: HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking ☆72 Updated 2 years ago
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆38 Updated 9 months ago
- Build Text Rerankers with Deep Language Models ☆262 Updated last year
- An Open-Source Package for Information Retrieval ☆163 Updated 4 months ago
- An easy-to-use python toolkit for flexibly adapting various neural ranking models to target domain. ☆59 Updated 2 years ago
- This is the code for our KILT leaderboard submissions (KGI + Re2G models). ☆156 Updated 2 months ago
- Codebase for RetroMAE and beyond. ☆263 Updated last year
- Code repo for ACL22 paper "DeepStruct: Pretraining of Language Models for Structure Prediction" ☆85 Updated 2 years ago
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)" ☆101 Updated 2 years ago
- TUTA and ForTaP for Structure-Aware and Numerical-Reasoning-Aware Table Pre-Training ☆116 Updated 8 months ago
- A set of Python scripts for preprocessing the Wikidata JSON dump and running simple queries in an efficient manner. ☆124 Updated 9 months ago
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. ☆104 Updated 2 years ago
- The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. … ☆144 Updated 2 years ago
- Guideline following Large Language Model for Information Extraction ☆387 Updated 8 months ago
- MS MARCO (Microsoft Machine Reading Comprehension) is a large scale dataset focused on machine reading comprehension, question answering, … ☆126 Updated 3 years ago
- Dataset and code for EMNLP2020 paper "HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data" ☆232 Updated 2 years ago
- YuLan-IR: Information Retrieval Boosted LMs ☆222 Updated last year
- A dataset for training/evaluating Question Answering Retrieval models on ChatGPT responses with the possibility to training/evaluating on… ☆142 Updated last year
- [ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping". ☆226 Updated 10 months ago
- ☆109 Updated last year
- PyTorch implementation and pre-trained models for ASP - Autoregressive Structured Prediction with Language Models, EMNLP 22. https://arxi… ☆106 Updated last year
- [ACL-24 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition" ☆54 Updated last year
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd… ☆60 Updated 7 months ago
- Finetune mistral-7b-instruct for sentence embeddings ☆85 Updated last year
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆160 Updated last year
- The unified platform for data-related resources. ☆134 Updated 2 years ago
- Document Ranking with Large Language Models. ☆169 Updated last month
- Scalable training for dense retrieval models. ☆299 Updated last month
- [ACL 2022] A hierarchical table dataset for question answering and data-to-text generation. ☆90 Updated 3 months ago
- [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction ☆308 Updated 2 years ago