kevinxiong / epub2txt
convert epub file to txt
☆88 Updated 5 years ago
Alternatives and similar repositories for epub2txt
Users that are interested in epub2txt are comparing it to the libraries listed below
- Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt). ☆52 Updated last year
- Python-based software to unpack kindlegen-generated ebooks ☆64 Updated 2 years ago
- A simple command-line utility for Linux, for extracting text from EPUB documents. ☆230 Updated 3 months ago
- Python module that identifies Chinese text as being Simplified or Traditional ☆93 Updated 6 months ago
- Convert epub file to txt ☆36 Updated last year
- Corpus of classical Chinese poetry ☆27 Updated 8 years ago
- Offline bilingual dictionaries made using data from Wiktionary ☆55 Updated 10 years ago
- List of English synonyms and antonyms parsed from the public domain book of James C. Fernald, 1896 ☆43 Updated 6 years ago
- Tool for converting Chinese characters to Wubi input codes ☆33 Updated 6 years ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆91 Updated this week
- The New York Times English-Chinese parallel corpus ☆16 Updated 3 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 Updated 3 years ago
- Extract data from Octopus mdict (*.mdd, *.mdx) files ☆23 Updated 7 years ago
- Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT ☆23 Updated 2 years ago
- Faster, modernized fork of the language identification tool langid.py ☆56 Updated 6 months ago
- Offline etymological dictionary based on Wiktionary data ☆21 Updated 3 years ago
- Corpus of species names, covering plant and animal names. ☆48 Updated last year
- Scripts to auto-OCR PDFs, translate output using publicly-available or DIY NLP translation models, and generate epub/PDF ☆43 Updated last year
- A Metafont glyph dataset that helps people define CJK-like glyphs with their Metafont scripts using machine learning ☆12 Updated 7 months ago
- Classical Chinese punctuation experiment with Keras using the daizhige (殆知阁古代文献藏书) dataset ☆35 Updated 2 years ago
- Latin language dictionaries ☆36 Updated 4 years ago
- A Python module for English lemmatization and inflection. ☆268 Updated last year
- Generate an HTML, PDF, or JPG file for specific words from an mdx dictionary ☆37 Updated last year
- pygoogletranslation: Free and Unlimited Google translate API for Python. Translates totally free of charge. ☆159 Updated 4 years ago
- Measure the readability of a given text using surface characteristics ☆79 Updated 4 months ago
- Adds word-frequency markers and annotations (dictionary definitions) to EPUB e-books ☆15 Updated 6 years ago
- A Python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- Linguistic search for large annotated text corpora, based on Apache Lucene ☆112 Updated this week
- Corpus of book titles, including some film and game titles. ☆71 Updated last year
- The multilingual variant of GLM, a general language model trained with an autoregressive blank infilling objective ☆62 Updated 2 years ago