kiasar / Dictionary_crawler
Python code based on the Scrapy package that crawls well-known online dictionaries such as Oxford, Longman, Cambridge, Webster, and Collins to build a dataset.
☆102 Updated last year
Alternatives and similar repositories for Dictionary_crawler:
Users who are interested in Dictionary_crawler are comparing it to the libraries listed below:
- This project converts The Online Plain Text English Dictionary (OPTED) to an SQLite database and JSON files ☆86 Updated 4 years ago
- List of English synonyms and antonyms parsed from the public domain book of James C. Fernald, 1896 ☆43 Updated 6 years ago
- Complete Conjugation of any Verb(e) in Catalan, French, Italian, Portuguese, Romanian or Spanish and conjugate unknown verbs using Machin… ☆86 Updated 9 months ago
- Python 3.6+ port of aeneas ☆14 Updated 3 years ago
- Searching in-memory corpus with Corpus Query Language (CQL) ☆19 Updated 4 months ago
- Convert an EPUB file to TXT ☆85 Updated 4 years ago
- Verb forms dictionary ☆65 Updated 7 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆243 Updated 2 years ago
- Scripts for building a geo-located web corpus using Common Crawl data ☆11 Updated 3 weeks ago
- A sentence segmentation library with wide language support optimized for speed and utility. ☆60 Updated 7 months ago
- Fifteen Thousand Useful Phrases, by Greenville Kleiser ☆54 Updated 8 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆97 Updated 3 weeks ago
- Machine-Translation-based sentence alignment tool for parallel text ☆308 Updated 4 years ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆89 Updated last year
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆123 Updated 10 months ago
- Interactive visualization of Wiktionary words and etymologies. ☆92 Updated last month
- A Python Wiktionary Parser ☆357 Updated last month
- An NLP pipeline for Hebrew ☆37 Updated 3 weeks ago
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆72 Updated 3 months ago
- Lists of most-frequently-used English words / nouns / verbs etc. ☆60 Updated 4 years ago
- WordNet in JSON format. ☆90 Updated 4 years ago
- Extract data from Octopus mdict (*.mdd, *.mdx) files ☆23 Updated 7 years ago
- Gather modern English word frequencies from all enwiki articles. ☆212 Updated last year
- A modern, interlingual wordnet interface for Python ☆236 Updated this week
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆361 Updated last year
- API to retrieve word definitions and other info ☆64 Updated 2 years ago
- Open Language Profiles: English profile datasets from CEFR-J ☆122 Updated 5 years ago
- A text file containing English words, along with the definition, parts of speech (noun, verb, adjective, etc.), and a link to the URL where … ☆11 Updated 11 months ago
- Faster, modernized fork of the language identification tool langid.py ☆55 Updated 4 months ago