pgcorpus/gutenberg

Pipeline to generate the Standardized Project Gutenberg Corpus

☆220

Alternatives and similar repositories for gutenberg

Users who are interested in gutenberg compare it to the repositories listed below.

pgcorpus / gutenberg-analysis
Analysis of gutenberg dataset
☆44 · Dec 22, 2018 · Updated 7 years ago
c-w / gutenberg
A simple interface to the Project Gutenberg corpus.
☆333 · Jan 12, 2023 · Updated 3 years ago
laurejt / authorless-tms
Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure"
☆29 · May 13, 2020 · Updated 6 years ago
tuhinjubcse / FigurativeNarrativeBenchmark
Code and data for the TACL paper "It’s not Rocket Science: Interpreting Figurative Language in Narratives"
☆15 · Sep 4, 2023 · Updated 2 years ago
Bookworm-project / Docs
Documentation for Bookworm, particularly focusing on creation aspects
☆10 · Aug 26, 2016 · Updated 9 years ago
lwachowiak / Metaphor-Extraction-With-GPT-3
Code for our ACL'23 paper on how to identify metaphor mappings with the help of GPT-3
☆12 · May 21, 2025 · Updated last year
OpenBB-finance / openbb-metricsv2
Fuels the OpenBB company public metrics
☆18 · Updated this week
matinho13 / SentiArt
A simple vector space model based tool for sentiment analysis of literary texts
☆19 · Sep 17, 2024 · Updated last year
aparrish / gutenberg-poetry-corpus
A corpus of poetry from Project Gutenberg
☆218 · Aug 13, 2018 · Updated 7 years ago
hugovk / gutenberg-metadata
Metadata from Project Gutenberg
☆41 · Jul 6, 2026 · Updated 2 weeks ago
socius-org / sentibank
Encyclopedic Hub for Sentiment Dictionaries
☆15 · Nov 20, 2025 · Updated 8 months ago
davanstrien / huggingface-tldr
Experimental tl;dr summaries for datasets on the Hugging Face Hub!
☆10 · Apr 4, 2024 · Updated 2 years ago
dbamman / book-nlp
Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co…
☆318 · Feb 4, 2022 · Updated 4 years ago
emorynlp / levi-graph-amr-parser
☆11 · Nov 16, 2022 · Updated 3 years ago
AxelSorensenDev / Eevee
An Easy Annotation Tool for Natural Language Processing
☆12 · May 17, 2024 · Updated 2 years ago
Hollings / thisrecipedoesnotexist
Uses the power of M A C H I N E L E A R N I N G to generate recipes
☆10 · May 25, 2026 · Updated last month
emptymalei / sci2fi
From science to science fiction
☆16 · Sep 25, 2015 · Updated 10 years ago
Maitreyee1 / Building-LLM-Ground-Up
This repository was created as part of Sebastian Raschka's workshop, Building LLMs Ground Up.
☆16 · Sep 14, 2024 · Updated last year
hbiaou / openalex-mcp
An MCP server designed for academic literature research using the OpenAlex free API.
☆15 · Jun 25, 2025 · Updated last year
jg-you / sbm_canonical_mcmc
C++ implementation of an MCMC sampler for the (canonical) SBM
☆11 · Oct 21, 2019 · Updated 6 years ago
mirabdullahyaser / LLaMA3-Financial-Analyst
LLM-powered financial analyst using LoRA-tuned Llama-3 and RAG pipeline to answer complex queries over SEC 10-K filings with contextual a…
☆16 · Feb 9, 2025 · Updated last year
c-w / gutenberg-http
An HTTP interface to the Project Gutenberg corpus.
☆76 · Aug 25, 2019 · Updated 6 years ago
computationalstylistics / 100_english_novels
A benchmark corpus of 100 English novels, covering the 19th and the beginning of the 20th century
☆24 · Aug 10, 2022 · Updated 3 years ago
mjstrobl / WEXEA
Wikipedia EXhaustive Entity Annotator (LREC 2020)
☆16 · Apr 22, 2024 · Updated 2 years ago
ddhruvkr / Edit-Unsup-TS
This repo contains the code for our paper "Iterative Edit-Based Unsupervised Sentence Simplification" accepted at ACL 2020.
☆14 · Jul 19, 2021 · Updated 5 years ago
XinyuHua / neural-argument-generation
Project page for "Neural Argument Generation Augmented with Externally Retrieved Evidence"
☆21 · Apr 24, 2022 · Updated 4 years ago
booknlp / booknlp
BookNLP, a natural language processing pipeline for books
☆927 · Jul 31, 2024 · Updated last year
joehoover / bragi
☆19 · Jun 5, 2023 · Updated 3 years ago
alvations / SeedLing
Building and Using A Seed Corpus for the Human Language Project
☆11 · Feb 9, 2018 · Updated 8 years ago
rubenvangenugten / autobiographical_interview_scoring
Using NLP to automatically score autobiographical interview narratives
☆22 · Updated this week
yunitata / coling2018
☆13 · Jun 24, 2019 · Updated 7 years ago
gambolputty / newscorpus
A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as database.
☆20 · Jul 5, 2024 · Updated 2 years ago
data-science-in-ed / Syllabus
Syllabus for EDCT GE 2550
☆16 · Oct 3, 2019 · Updated 6 years ago
abhilasha23 / StoryTelling
A neural network based StoryTeller that outputs a short story from an input image
☆13 · Dec 15, 2018 · Updated 7 years ago
steven-tey / awesome-url-shortener
🔗 A curated list of awesome URL shorteners
☆23 · Jan 22, 2024 · Updated 2 years ago
kawine / contextual
How Contextual are Contextualized Word Representations?
☆43 · Apr 29, 2020 · Updated 6 years ago
meyersbs / uncertainty
A Python implementation of the uncertainty classifier, based on the work of Veronika Vincze.
☆17 · Aug 20, 2024 · Updated last year
AI4LAM / fastai4GLAMS
A study group for v4 of the fastai introduction to deep learning course with a focus on applications in GLAM settings
☆15 · Oct 13, 2021 · Updated 4 years ago
Anterotesis / historical-texts
Collections of English historical texts and data relating to them
☆19 · Mar 24, 2021 · Updated 5 years ago