DistrictDataLabs / baleen
An automated ingestion service for blogs to construct a corpus for NLP research.
☆87, updated 6 years ago
Alternatives and similar repositories for baleen:
Users who are interested in baleen are comparing it to the libraries listed below.
- Language detection extension for spaCy 2.0+ (☆112, updated 6 years ago)
- 💫 Scripts, tools and resources for developing spaCy (☆126, updated 6 years ago)
- A Topic Modeling toolbox (☆92, updated 9 years ago)
- Browser-based slides or PDFs of our talks and presentations (☆94, updated 6 years ago)
- Materials for the workshop Advanced Text Analysis with SpaCy and Scikit-Learn, given at NYU during NYCDH Week 2017, at PyData NYC in Nov.… (☆82, updated 2 years ago)
- 💫 Jupyter notebooks for spaCy examples and tutorials (☆288, updated 6 years ago)
- Server/client around spaCy to load spaCy only once (☆46, updated 7 years ago)
- Relatively simple text classification powered by spaCy (☆41, updated 9 years ago)
- Multidimensional data explorer and visualization tool. (☆56, updated 7 years ago)
- A visualisation tool for spaCy using Hierplane. (☆65, updated 2 years ago)
- For extracting measurements and related entities from text (☆58, updated 5 years ago)
- Similarity search on Wikipedia using gensim in Python. (☆60, updated 6 years ago)
- Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG. (☆105, updated 2 years ago)
- NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering) (☆115, updated last year)
- 🤹‍♀️ Query spaCy's linguistic annotations using GraphQL (☆86, updated 6 years ago)
- Natural Language Processing with Spark's MLlib