abhijith-athreya / ASDUS
Automatic Segment Detection using Unsupervised and Supervised Learning (ASDUS) is a system that detects title and prose segments in HTML documents.
☆20 Updated 4 years ago
Alternatives and similar repositories for ASDUS:
Users who are interested in ASDUS are comparing it to the libraries listed below.
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020 ☆62 Updated 10 months ago
- A repository with several curated datasets of counter-narratives to fight online hate speech. ☆88 Updated last year
- Self-supervised NER prototype - updated version (69 entity types - 17 broad entity groups). Uses pretrained BERT models with no fine tuni… ☆79 Updated 2 years ago
- ☆53 Updated 2 years ago
- The NLPStatTest project ☆12 Updated 2 years ago
- Source code for our AAAI 2020 paper P-SIF: Document Embeddings using Partition Averaging ☆34 Updated 4 years ago
- PrivacyQA, a resource to support question-answering over privacy policies. ☆42 Updated 4 years ago
- ☆47 Updated 3 years ago
- Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between… ☆32 Updated last year
- Annotated corpus + evaluation metrics for text anonymisation ☆54 Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 Updated 8 months ago
- CrowdTruth framework for crowdsourcing ground truth for training & evaluation of AI systems ☆58 Updated 10 months ago
- A framework for adversarial attacks against token classification models ☆32 Updated 3 years ago
- Repository for the Dynamically Generated Hate Speech Dataset by Vidgen et al. (2021). ☆43 Updated 3 years ago
- MultiCite code and data. Models are available on Huggingface. ☆29 Updated 2 years ago
- ☆85 Updated 3 years ago
- The dataset and code for ACL 2022 paper "SciNLI: A Corpus for Natural Language Inference on Scientific Text" are released here. ☆27 Updated last year
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆44 Updated 9 months ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆83 Updated 2 weeks ago
- SegEval Segmentation Evaluation Package ☆56 Updated last year
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- REMERGE - Multi-Word Expression discovery algorithm ☆14 Updated 2 years ago
- Repro is a library for easily running code from published papers via Docker. ☆40 Updated last year
- Open Source / ENTSUM: A Data Set for Entity-Centric Extractive Summarization ☆28 Updated 2 years ago
- ☆59 Updated 6 months ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆26 Updated 4 years ago
- This repository contains a dataset for hate speech detection on social media platforms. ☆70 Updated 2 years ago
- Repository for the paper "Named Entity Recognition for Entity Linking: What Works and What's Next" (EMNLP 2021). ☆75 Updated 3 years ago
- ☆22 Updated 3 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆68 Updated 3 years ago