rhgarcia / tropescraper
A tropes scraper
☆35Updated 2 years ago
Alternatives and similar repositories for tropescraper
Users who are interested in tropescraper are comparing it to the libraries listed below
- ☆60Updated 2 years ago
- Discourse Analysis Tool Suite☆24Updated this week
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.☆52Updated 3 years ago
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4.☆33Updated 2 years ago
- Scansion tool for Spanish texts☆12Updated last year
- A tool for analyzing the word histories of a text.☆34Updated 7 months ago
- A Python package for cleaning Gutenberg books and dataset☆34Updated last month
- WordWanderer – take your text for a walk☆12Updated 6 years ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo…☆108Updated 6 years ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities.☆18Updated 10 months ago
- Gutenberg cache and query library☆37Updated 11 months ago
- CMU Linguistic Annotation Backend☆15Updated last year
- Measure the similarity of text corpora for 74 languages☆13Updated last year
- SFGram (Science-Fiction Gram) is a dataset of public science-fiction novels, books and movie covers. It is designed to be used by researc…☆31Updated 6 years ago
- ☆67Updated last year
- Next-generation Punkt sentence boundary detection with zero dependencies☆17Updated 2 months ago
- An experimental implementation of Burrows' Delta in Python 3☆21Updated 3 years ago
- Generate a SQLite database from Wikipedia & Wikidata dumps.☆35Updated last year
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023☆19Updated 2 years ago
- The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data.☆17Updated 7 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 5 years ago
- A News Article Collection Library☆22Updated 2 years ago
- Documentation effort for the BookCorpus dataset☆34Updated 4 years ago
- ☆12Updated 10 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu…☆64Updated last year
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g…☆112Updated 5 months ago
- ☆23Updated 8 months ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata☆94Updated 2 years ago
- Implementation for WikiCheck API, an open-source Wikipedia-based fact-checking API. The project is done in cooperation with Wikimedia Fou…☆25Updated last year
- SentimentArcs: a large ensemble of dozens of sentiment analysis models to analyze emotion in text over time☆40Updated 2 years ago