maxent-ai / Datasets
Datasets with text data for use in NLP, text analysis, information extraction, and ML research.
☆16 Updated 5 years ago
Related projects
Alternatives and complementary repositories for Datasets
- Build intelligent data-driven applications with minimal effort. Sentence Clustering, Topics Extraction, Text Similarity, Opinion Summariz… ☆40 Updated 5 years ago
- Generate variations of text through synonym matching ☆12 Updated 7 years ago
- The objective of this project is to scrape a corpus of news articles from a set of web pages, pre-process the corpus, and then to apply u… ☆50 Updated 7 years ago
- BERT semantic search engine for searching research literature on coronavirus (COVID-19) in Google Colab ☆31 Updated 4 years ago
- Teaching material and other info associated with the Information Extraction using Topic Models tutorial at SciPy US 2018. ☆19 Updated 6 years ago
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values … ☆15 Updated 6 years ago
- Multi-Label Text Classification with Transfer Learning ☆16 Updated 4 years ago
- ☆14 Updated 5 years ago
- This is Yunshu's [Activision](https://www.activision.com/) internship project. We are interested in understanding user opinions about Act… ☆55 Updated 5 years ago
- Extracting narrative timelines (i.e. order and timing of events) from text ☆20 Updated 5 years ago
- A small repository to test Captum Explainable AI with a trained Flair transformers-based text classifier. ☆26 Updated 3 years ago
- ☆16 Updated last year
- Short Text Topic Modeling notebook example ☆12 Updated 4 years ago
- Political Discourse Analysis Using Pre-Trained Word Vectors. ☆22 Updated last year
- Corpus and a baseline neural network system for Named Entity Recognition in Hindi-English Code-Mixed social media text. ☆45 Updated 4 years ago
- Regex-like pattern tree matching, but on a sentence's tree instead of on strings ☆42 Updated 6 years ago
- 🚀 GUI for training spaCy models ☆53 Updated 3 years ago
- Clinical spelling correction with word and character n-gram embeddings. ☆74 Updated 2 years ago
- Template for AC297r projects ☆33 Updated 4 years ago
- Explaining textual analysis tools in Python. Including Preprocessing, Skip Gram (word2vec), and Topic Modelling. ☆59 Updated 7 years ago
- Contains data, format checker, scorer and baselines for the CLEF2020-CheckThat! Task 1. ☆20 Updated last year
- Detecting Sarcasm on Twitter using both traditional machine learning and deep learning techniques. ☆94 Updated 6 years ago
- Detect and extract locations from text or URL page ☆21 Updated 4 years ago
- Detects whether a sentence is in a subjective or objective form ☆24 Updated last year
- This script uses an ensemble of multiple methods: RAKE, TF-IDF and Automatic Keyword Extraction to obtain top keywords in Reddit posts. P… ☆11 Updated 7 years ago
- Code related to experimentation of different Text Data Augmentation Techniques ☆15 Updated 5 years ago
- Question answering system developed using seq2seq and memory network model in Keras ☆22 Updated 6 years ago
- A previous version of Snorkel focused on information extraction ☆34 Updated 5 years ago
- Text processing library for sentiment analysis and related tasks ☆27 Updated 6 years ago
- On Generating Extended Summaries of Long Documents ☆77 Updated 3 years ago