webis-de / cikm20-web-page-segmentation-revisited-evaluation-framework-and-dataset
Code for "Web Page Segmentation Revisited: Evaluation Framework and Dataset", accepted as a resource paper at CIKM 2020
☆14 Updated 2 years ago
Alternatives and similar repositories for cikm20-web-page-segmentation-revisited-evaluation-framework-and-dataset
Users who are interested in cikm20-web-page-segmentation-revisited-evaluation-framework-and-dataset are comparing it to the repositories listed below
- ☆26 Updated 10 months ago
- Implementation of Microsoft Vips algorithm in Python ☆18 Updated 5 years ago
- SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval ☆47 Updated 2 years ago
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆44 Updated 3 years ago
- A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot! ☆93 Updated 3 months ago
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆38 Updated 8 months ago
- simple rule based named entity recognition ☆43 Updated 3 years ago
- MultiCite code and data. Models are available on Huggingface. ☆32 Updated 3 years ago
- Web content extraction using machine learning ☆33 Updated 4 years ago
- Unofficial Pytorch implementation of Dom-LM paper. ☆33 Updated 2 years ago
- The dataset includes UI object type labels (e.g., BUTTON, IMAGE, CHECKBOX) that describes the semantic type of an UI object on Android ap… ☆52 Updated 3 years ago
- Implementation of Vision Based Page Segmentation algorithm in Java ☆102 Updated 5 years ago
- Web page segmentation and noise removal ☆55 Updated last year
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an … ☆32 Updated last year
- Code repo for ACL22 paper "DeepStruct: Pretraining of Language Models for Structure Prediction" ☆84 Updated 2 years ago
- Measure the readability of a given text using surface characteristics ☆79 Updated 4 months ago
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆44 Updated last year
- ☆93 Updated 6 months ago
- Compute PageRank on >3 billion Wikipedia links on off-the-shelf hardware. ☆58 Updated 7 months ago
- FrameBERT: Conceptual Metaphor Detection with Frame Embedding Learning. Presented at EACL 2023. ☆32 Updated last year
- A spaCy wrapper for DBpedia Spotlight ☆110 Updated 2 years ago
- CHOLAN: A Modular Approach for Neural Entity Linking on Wikipedia and Wikidata ☆32 Updated 3 years ago
- Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering ☆30 Updated 2 years ago
- A set of Python scripts for preprocessing the Wikidata JSON dump and running simple queries in an efficient manner. ☆119 Updated 7 months ago
- An easy to use framework for large-scale fact-checking and question answering ☆69 Updated last year
- code and data used to build a training dataset for dragnet models ☆10 Updated 4 years ago
- Building knowledge graph from input data ☆52 Updated 5 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 Updated 4 years ago
- Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations b… ☆28 Updated 11 months ago
- Scripts used to make and evaluate OpenAlex's concept tagging model ☆49 Updated last year