nschaetti / SFGram-dataset
SFGram (Science-Fiction Gram) is a dataset of public science-fiction novels, books and movie covers. It is designed to be used by researchers to study the evolution of science-fiction literature over time and to test machine learning algorithms on authorship attribution and document classification tasks. All the documents are now published o…
☆32 · Updated 6 years ago
Alternatives and similar repositories for SFGram-dataset
Users who are interested in SFGram-dataset are comparing it to the libraries listed below.
- A corpus of poetry from Project Gutenberg ☆207 · Updated 7 years ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆110 · Updated 6 years ago
- Parse Sentences to extract evoked frames. ☆10 · Updated 6 years ago
- I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this ☆223 · Updated 2 years ago
- Weird A.I. Yankovic neural-net based lyrics parody generator ☆84 · Updated 3 years ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆200 · Updated last year
- The AI Knowledge Editor ☆185 · Updated 3 years ago
- Analysis of gutenberg dataset ☆45 · Updated 6 years ago
- ☆60 · Updated 2 years ago
- a python package for cleaning Gutenberg books and dataset ☆34 · Updated 4 months ago
- Code for Deep-speare: a joint neural model of poetic language, meter and rhyme ☆78 · Updated 2 years ago
- Finds linguistic patterns effortlessly ☆38 · Updated 2 years ago
- Libraries, Archives and Museums (LAM) ☆85 · Updated 2 years ago
- MinScIE is an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations. ☆15 · Updated 6 years ago
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 · Updated 2 years ago
- A framework-agnostic client-side JavaScript library for logging user interactions on webpages. ☆18 · Updated 3 years ago
- Practical Approaches to Data Science with Text ☆39 · Updated 5 years ago
- Human-free quality estimation of document summaries ☆97 · Updated last year
- Frame Semantic Parser based on T5 and FrameNet ☆62 · Updated last year
- A large scale Humor Dataset, containing more than 550k rated English jokes (LREC'20) ☆66 · Updated 2 years ago
- GPT-2 finetuned on dril twetes ☆15 · Updated 6 years ago
- Materials for PyCon 2020 Workshop, "Nonsense verse... with Python and machine learning" ☆30 · Updated 2 years ago
- Poetic processing, for Python. ☆42 · Updated last year
- Universal Semantic Annotator (LREC 2022) ☆17 · Updated 7 months ago
- Releases for the reddit-graph project ☆18 · Updated last year
- ☆34 · Updated 2 years ago
- Python SDK for the TextRazor Text Analytics API ☆20 · Updated last year
- ☆193 · Updated last year
- Python tools for interacting with Wikidata ☆154 · Updated last year
- Homebase of the IPTC EXTRA project about rule-based text categorization ☆13 · Updated 8 years ago