BramVanroy / spacy_download
Download and load spaCy models on-the-fly
☆14 · Updated 2 years ago
Alternatives and similar repositories for spacy_download:
Users who are interested in spacy_download are comparing it to the libraries listed below.
- Using short models to classify long texts ☆21 · Updated last year
- BERT models for many languages created from Wikipedia texts ☆33 · Updated 4 years ago
- spaCy match and replace, maintaining conjugation ☆35 · Updated 2 years ago
- These are lists for a variety of languages containing words that are distinctive to each language. ☆35 · Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated 11 months ago
- Scripts to convert datasets from various sources to Hugging Face Datasets. ☆58 · Updated 2 years ago
- Library for fast text representation and classification. ☆28 · Updated last year
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆10 · Updated last year
- Tool for parsing and converting various span encoding schemes. ☆22 · Updated last year
- Generate BERT vocabularies and pretraining examples from Wikipedias ☆18 · Updated 4 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆96 · Updated 9 months ago
- GC4LM: A Colossal (Biased) language model for German ☆13 · Updated 3 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆29 · Updated 2 weeks ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 · Updated 3 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆86 · Updated last month
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 · Updated 2 years ago
- GrammarTagger: A Neural Multilingual Grammar Profiler for Language Learning ☆27 · Updated 3 years ago
- sequence tagging with spaCy and crfsuite ☆19 · Updated last year
- My NER Experiments with ModernBERT ☆17 · Updated last month
- Pre-train Static Word Embeddings ☆47 · Updated 3 weeks ago
- Code for SaGe subword tokenizer (EACL 2023) ☆22 · Updated 2 months ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated last month
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆22 · Updated 3 years ago
- Temporarily remove unused tokens during training to save RAM and improve speed. ☆22 · Updated 7 months ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 · Updated 2 years ago
- REMERGE: Multi-Word Expression discovery algorithm ☆14 · Updated 2 years ago