Ankur3107 / nlp_preprocessing
Text preprocessing package that includes cleaning, tokenization, dataset preparation, etc.
☆17 · Updated 4 years ago
Alternatives and similar repositories for nlp_preprocessing:
Users who are interested in nlp_preprocessing are comparing it to the libraries listed below:
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020 ☆62 · Updated 10 months ago
- Code and resources for the paper "BERT-QE: Contextualized Query Expansion for Document Re-ranking". ☆50 · Updated 3 years ago
- X-BERT: eXtreme Multi-label Text Classification with BERT ☆52 · Updated 5 years ago
- Low-code pre-built pipelines for experiments with huggingface/transformers for Data Scientists in a rush. ☆16 · Updated 4 years ago
- Bi-encoder Based Entity Linking Tutorial. You can run the experiment in only 5 minutes. Experiments on a Colab Pro GPU are also supported! ☆34 · Updated 3 years ago
- An optimized Transformer-based abstractive summarization model with TensorFlow ☆16 · Updated 2 years ago
- Dynamic ensemble decoding with transformer-based models ☆29 · Updated last year
- ☆61 · Updated 4 years ago
- Source code for our AAAI 2020 paper P-SIF: Document Embeddings using Partition Averaging ☆34 · Updated 4 years ago
- On Generating Extended Summaries of Long Documents ☆78 · Updated 4 years ago
- ☆54 · Updated 3 years ago
- Package for controllable summarization ☆78 · Updated 2 years ago
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with PyTorch … ☆80 · Updated 2 years ago
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". ☆63 · Updated 4 years ago
- Information and data related to the ProtestNews shared task at the CASE @ ACL-IJCNLP 2021 workshop ☆43 · Updated 2 years ago
- LongSumm - Scientific Document Summarization Task ☆74 · Updated 2 years ago
- Corresponding code repo for the paper at COLING 2020 - ARGMIN 2020: "DebateSum: A large-scale argument mining and summarization dataset" ☆54 · Updated 3 years ago
- Kex is a Python library for unsupervised keyword extraction from a document, providing an easy interface and benchmarks on 15 public data… ☆54 · Updated 3 years ago
- Multi^2OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT (Findings of ACL: EMNLP 2020) ☆56 · Updated 2 years ago
- Simple rule-based named entity recognition ☆43 · Updated 3 years ago
- Use BERT to Fill in the Blanks ☆82 · Updated 3 years ago
- Experimental code used in pre-training the KBIR and KeyBART models ☆26 · Updated 2 years ago
- ☆41 · Updated 3 years ago
- Collection of NLP model explanations and accompanying analysis tools ☆145 · Updated last year
- NELA Features for News Veracity. Used in multiple studies. ☆10 · Updated 4 years ago
- ☆13 · Updated 2 years ago
- Regular spotlights of underrated NLP and Data Science GitHub repositories ☆35 · Updated 4 years ago
- Code for equipping pretrained language models (BART, GPT-2, XLNet) with commonsense knowledge for generating implicit knowledge statement… ☆16 · Updated 3 years ago
- ☆59 · Updated 3 years ago
- KitanaQA: Adversarial training and data augmentation for neural question-answering models ☆57 · Updated last year