piskvorky / gensim-data
Data repository for pretrained NLP models and NLP corpora.
☆983 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for gensim-data
- sentence embedding by Smooth Inverse Frequency weighting scheme ☆1,084 · Updated 5 years ago
- Python Keyphrase Extraction module ☆1,562 · Updated last year
- Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenizati… ☆661 · Updated 8 months ago
- Python scripts for training/testing paragraph vectors ☆645 · Updated last year
- Tensorflow implementation of contextualized word representations from bi-directional language models ☆1,620 · Updated last year
- General purpose unsupervised sentence representations ☆1,192 · Updated 2 years ago
- Super easy library for BERT based NLP models ☆1,863 · Updated 2 months ago
- A Python framework for sequence labeling evaluation (named-entity recognition, pos tagging, etc...) ☆1,093 · Updated 2 months ago
- A curated list of pretrained sentence and word embedding models ☆2,225 · Updated 3 years ago
- A python tool for evaluating the quality of sentence embeddings. ☆2,087 · Updated 7 months ago
- Toy Python implementation of http://www-nlp.stanford.edu/projects/glove/ ☆1,252 · Updated 2 years ago
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,351 · Updated 5 months ago
- Simple web service providing a word embedding model ☆1,433 · Updated last year
- PyTorch deep learning models for document classification ☆595 · Updated last year
- InferSent sentence embeddings ☆2,280 · Updated 3 years ago
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ☆1,203 · Updated 9 months ago
- semi supervised guided topic model with custom guidedLDA ☆499 · Updated 4 years ago
- Pre-trained ELMo Representations for Many Languages ☆1,463 · Updated 3 years ago
- Python wrapper for Stanford CoreNLP. ☆921 · Updated 2 years ago
- Code for paper Fine-tune BERT for Extractive Summarization ☆1,465 · Updated 2 years ago
- A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of lang… ☆1,505 · Updated 4 months ago
- NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character … ☆1,888 · Updated 2 years ago
- Semantic Text Similarity Dataset Hub ☆715 · Updated 6 years ago
- Compute Sentence Embeddings Fast! ☆618 · Updated last year
- ☆1,294 · Updated 2 years ago
- A fast, efficient universal vector embedding utility package. ☆1,627 · Updated last year
- Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx ☆627 · Updated 3 years ago
- 🔡 Token level embeddings from BERT model on mxnet and gluonnlp ☆452 · Updated 4 years ago
- GSDMM: Short text clustering ☆353 · Updated last year
- EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings (official implementation) ☆432 · Updated last year