Grasia / wiki-scripts
Miscellaneous scripts to gather and process data of wikis.
☆22 · Updated last year
Alternatives and similar repositories for wiki-scripts:
Users who are interested in wiki-scripts are comparing it to the repositories listed below.
- Repository of data and code to use the models described in the paper "Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia… ☆10 · Updated 2 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 · Updated 4 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Updated 2 years ago
- Ensemble topic modeling with matrix factorization ☆25 · Updated 6 years ago
- Presentations & notebooks from our talks/workshops/meetups/etc ☆24 · Updated 7 years ago
- A thin wrapper around the DBpedia Spotlight HTTP API ☆25 · Updated 7 years ago
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison. ☆14 · Updated 3 years ago
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆28 · Updated 4 years ago
- Load embeddings and featurize your sentences. ☆28 · Updated 5 months ago
- Build intelligent data-driven applications with minimal effort. Sentence Clustering, Topics Extraction, Text Similarity, Opinion Summariz… ☆40 · Updated 5 years ago
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation ☆14 · Updated 7 years ago
- ☆30 · Updated 2 years ago
- ☆30 · Updated 3 months ago
- Read compressed NDJSON .zst files easily ☆32 · Updated 2 years ago
- Calculate readability scores ☆40 · Updated 6 years ago
- Harassment Lexicon and Corpus ☆30 · Updated 6 years ago
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 · Updated 2 years ago
- Construction Grammar based BERT ☆13 · Updated 4 years ago
- Sequence tagging with spaCy and crfsuite ☆19 · Updated 2 years ago
- Regex-like pattern tree matching, but on a sentence's tree instead of strings ☆42 · Updated 7 years ago
- A tidy and complete archive of metadata for papers on arxiv.org, 1993-2019 ☆28 · Updated 5 years ago
- Classify names by gender, U.S. ethnicity, or leaf nationality ☆19 · Updated 6 years ago
- SemEval 2019 Hyperpartisan News Detection - team Bertha von Suttner contribution ☆22 · Updated 5 years ago
- Unsupervised method for extracting quotation-speaker pairs from large news corpora. ☆29 · Updated 6 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- Notebooks and data associated with constructing and exploring a map of subreddits. ☆55 · Updated 7 years ago
- Clean personally identifiable information from dirty dirty text using spaCy. ☆41 · Updated last year
- Labeled segmentation for the document structure of printed books ☆13 · Updated 7 years ago
- Python tools for text ☆15 · Updated 4 years ago
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018). ☆14 · Updated 6 years ago