kiasar / Dictionary_crawler
Python code based on the Scrapy package that crawls well-known online dictionaries (Oxford, Longman, Cambridge, Webster, and Collins) to build a dataset.
☆102 · Updated last year
Alternatives and similar repositories for Dictionary_crawler:
Users who are interested in Dictionary_crawler are comparing it to the libraries listed below:
- List of English synonyms and antonyms parsed from the public domain book of James C. Fernald, 1896 ☆43 · Updated 6 years ago
- A modern, interlingual wordnet interface for Python ☆235 · Updated 3 weeks ago
- A Python Wiktionary Parser ☆358 · Updated last month
- convert epub file to txt ☆85 · Updated 4 years ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 · Updated 9 years ago
- Fifteen Thousand Useful Phrases, by Greenville Kleiser ☆54 · Updated 8 years ago
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆23 · Updated 7 years ago
- A python module for English lemmatization and inflection. ☆265 · Updated last year
- Verb forms dictionary ☆64 · Updated 7 years ago
- Extract and align grammar patterns from English sentences. ☆54 · Updated 2 years ago
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆29 · Updated last month
- A text file containing English words, along with the definition, parts of speech (noun,verb,adjective,etc.), and a link to the url where … ☆10 · Updated 10 months ago
- The Open English WordNet ☆521 · Updated last month
- Arabic Transliteration in Python ☆36 · Updated 11 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- A python utility for downloading Common Crawl data ☆236 · Updated last year
- Offline database of synonyms/thesaurus ☆192 · Updated last year
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆61 · Updated 2 weeks ago
- A list of vocabulary lists ☆21 · Updated 4 years ago
- A python module for word inflections designed for use with spaCy. ☆92 · Updated 5 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆98 · Updated 2 weeks ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆241 · Updated 2 years ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆123 · Updated 9 months ago
- This packages up data for the Open Multilingual Wordnet ☆47 · Updated last week
- ☆72 · Updated 3 weeks ago
- linguistics backend ☆41 · Updated last year
- Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt). ☆52 · Updated last year
- Searching in-memory corpus with Corpus Query Language (CQL) ☆19 · Updated 3 months ago
- api to retrieve word definitions and other info ☆63 · Updated 2 years ago
- a CSV of every english word, part of speech, and definition. as well as a web scraping script that generates that data for you ☆110 · Updated 2 years ago