uchidalab / book-dataset
This dataset contains 207,572 books from the Amazon.com, Inc. marketplace.
☆254Updated 4 years ago
Alternatives and similar repositories for book-dataset
Users who are interested in book-dataset are comparing it to the repositories listed below
- Classification of books based on titles without prior knowledge of context or author☆59Updated 2 years ago
- Generating paper titles (and more!) with GPT trained on data scraped from arXiv.☆149Updated 2 years ago
- GloVe word vector embedding experiments (similar to Word2Vec)☆67Updated 2 years ago
- Toolbox for OCR post-correction☆121Updated 5 years ago
- Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance.☆77Updated 3 years ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface☆260Updated 10 months ago
- A tool for converting PDF into hOCR with text, tables, and figures being recognized and preserved.☆450Updated last year
- ☆129Updated 3 years ago
- ☆159Updated 2 years ago
- ✔️Contextual word checker for better suggestions (not actively maintained)☆414Updated 5 months ago
- Keras implementation of character-level sequence-to-sequence learning for spelling correction☆74Updated 6 years ago
- Word Embeddings for Information Retrieval☆225Updated last year
- Sentence Classifications with Neural Networks☆237Updated 2 years ago
- find any kind of occupation or job title in a text or file☆84Updated last year
- Atlas: A Dataset and Benchmark for E-commerce Clothing Product Categorization☆76Updated 2 years ago
- Transliteration module for Indian Languages☆78Updated last year
- A python module for English lemmatization and inflection.☆268Updated last year
- 💙 Emoji handling and meta data for spaCy with custom extension attributes☆181Updated 2 years ago
- Key information extraction from text and graph visualization☆91Updated 5 years ago
- A simple text reuse detection CLI tool.☆135Updated last year
- A Box detection algorithm for any image containing boxes.☆120Updated 5 years ago
- Generate realistic Instagram captions using transformers 🤗☆101Updated 2 years ago
- ☆139Updated last year
- ☆72Updated 7 years ago
- A text analysis application for performing common NLP tasks through a web dashboard interface and an API☆124Updated 6 years ago
- Fixes contractions such as `you're` to `you are`☆318Updated 2 years ago
- Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG.☆105Updated 2 years ago
- Document Similarity using Word2Vec☆101Updated 3 years ago
- Getting recommendations from natural language☆123Updated 5 years ago
- Keyword extraction using TextRank algorithm after pre-processing the text with lemmatization, filtering unwanted parts-of-speech and othe…☆114Updated 5 years ago