puntonim / gutenberg-bulk-downloader
Bulk downloader for free ebooks hosted at Project Gutenberg
☆19 Updated 3 years ago
Alternatives and similar repositories for gutenberg-bulk-downloader
Users who are interested in gutenberg-bulk-downloader are comparing it to the libraries listed below.
- A Python package for cleaning Gutenberg books and datasets ☆34 Updated 2 months ago
- A simple interface to the Project Gutenberg corpus. ☆330 Updated 2 years ago
- Pipeline for distributed Natural Language Processing, made in Python ☆65 Updated 8 years ago
- Scraper for downloading the entire ebooks repository of Project Gutenberg ☆151 Updated last week
- A spell-checker extending Peter Norvig's with multi-typo correction, Hamming distance weighting, and more. ☆98 Updated 4 years ago
- 📂 Additional lookup tables and data resources for spaCy ☆107 Updated last month
- A Python module for interfacing with the TreeTagger by Helmut Schmid. ☆75 Updated last month
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆65 Updated last year
- Basic dataset for the linguistic data collection. ☆15 Updated 8 years ago
- Polyglot is a language identifier for detecting text documents containing text written in more than one language, and for identifying the… ☆32 Updated 9 years ago
- The Language Learning Toolkit (LLTK) performs a variety of tasks useful for (human) language learning. ☆41 Updated 5 years ago
- Automatically exported from code.google.com/p/guess-language ☆52 Updated last year
- Interactive visualization of Wiktionary words and etymologies. ☆93 Updated last month
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆71 Updated last week
- Text tokenization and sentence segmentation (segtok v2) ☆205 Updated 3 years ago
- A Python module to discover the etymology of words ☆150 Updated last year
- Wiktionary parser tool for many language editions. ☆54 Updated 2 years ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram… ☆254 Updated 4 years ago
- A Python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- Crawler for linguistic corpora ☆205 Updated last year
- A simple configurable tool for manipulating dependency trees. ☆14 Updated 7 months ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆68 Updated 2 years ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆93 Updated last year
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆12 Updated last year
- Lightning Fast Language Prediction 🚀 ☆167 Updated 6 years ago
- Democratizing NLP! ☆105 Updated last year
- Extract Data from Wikipedia Tables ☆34 Updated 7 years ago
- SimpleNLG-EnFr 1.1 is a bilingual English/French adaptation of SimpleNLG v4.2 ☆25 Updated 7 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 Updated 5 years ago
- Transliteration module for Indian Languages ☆79 Updated last year