explosion / wikid
Generate a SQLite database from Wikipedia & Wikidata dumps.
★31, updated 7 months ago
Related projects
Alternatives and complementary repositories for wikid
- 🧪 Cutting-edge experimental spaCy components and features (★95, updated 6 months ago)
- Source code and data for Like a Good Nearest Neighbor (★28, updated 9 months ago)
- 💫 SpaCy wrapper for ConceptNet 💫 (★88, updated last year)
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy (★62, updated 8 months ago)
- (★22, updated 2 years ago)
- A spaCy custom component that extracts and normalizes temporal expressions (★52, updated last year)
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx-like syntax. (★57, updated 6 months ago)
- spaCy entry points for Curated Transformers (★25, updated last month)
- (★29, updated 2 years ago)
- spaCy match and replace, maintaining conjugation (★34, updated last year)
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… (★46, updated 7 months ago)
- Augmenty is an augmentation library based on spaCy for augmenting texts. (★151, updated 5 months ago)
- (★70, updated last year)
- spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking (★85, updated 2 years ago)
- The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. (★19, updated 4 months ago)
- A Prodigy plugin for evaluating spaCy pipelines (★12, updated 7 months ago)
- Information extraction from English and German texts based on predicate logic (★135, updated last year)
- Finds linguistic patterns effortlessly (★33, updated last year)
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata (★91, updated last year)
- Tokenization across languages. Useful as preprocessing for subword tokenization. (★22, updated last year)
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata (★153, updated 2 years ago)
- 🤗 Push your spaCy pipelines to the Hugging Face Hub (★43, updated 5 months ago)
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … (★106, updated 8 months ago)
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs). (★69, updated 3 months ago)
- Documentation effort for the BookCorpus dataset (★33, updated 3 years ago)
- A Python library aimed at dissecting and augmenting NER training data. (★56, updated last year)
- Language detection using spaCy and fastText (★54, updated 11 months ago)
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. (★85, updated last month)
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… (★37, updated 2 years ago)
- Generate reports for spaCy models. (★28, updated 2 years ago)