Long(er) text representation and classification using Doc2Vec embeddings
☆109 Jun 17, 2024 Updated last year
Alternatives and similar repositories for doc2vec
Users who are interested in doc2vec are comparing it to the libraries listed below
- A Persian Word2Vec Model trained by Wikipedia articles ☆10 Jan 5, 2018 Updated 8 years ago
- Python scripts for training/testing paragraph vectors ☆652 Sep 10, 2025 Updated 5 months ago
- Arabic News Stance Corpus ☆11 Feb 5, 2021 Updated 5 years ago
- Assessing syntactic abilities of BERT ☆40 Jul 18, 2019 Updated 6 years ago
- scripts to populate OpenEventDatabase from public data sources ☆22 Sep 24, 2025 Updated 5 months ago
- Using BERT For Classifying Documents with Long Texts, check my latest post: https://armandolivares.tech/ ☆41 Dec 19, 2019 Updated 6 years ago
- Various Algorithms for Short Text Mining ☆472 Updated this week
- An example on how to train supervised classifiers for multi-label text classification using sklearn pipelines ☆110 May 24, 2018 Updated 7 years ago
- ClickModels for Search Engines Implemented on top of Cython. ☆13 Jun 9, 2021 Updated 4 years ago
- A Persian POS Tagger with LSTM ☆13 Jul 19, 2022 Updated 3 years ago
- A plugin for Dokku that notifies Telegram of deployments. ☆13 Jan 5, 2018 Updated 8 years ago
- ArSarcasm-v2 is an extension to the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analys… ☆12 Jan 26, 2022 Updated 4 years ago
- PHP low-level client for Vespa. https://vespa.ai/ ☆17 Jan 22, 2026 Updated last month
- ☆12 Apr 5, 2019 Updated 6 years ago
- NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering) ☆116 May 3, 2024 Updated last year
- Doc2Vec algorithm for solving movie review sentiment analysis ☆25 Nov 26, 2015 Updated 10 years ago
- Normalized and modified version of Bijankhan corpus ☆13 Feb 21, 2023 Updated 3 years ago
- ☆12 Jun 6, 2020 Updated 5 years ago
- Using word embeddings, TFIDF and text-hashing to cluster and visualise text documents ☆15 Nov 7, 2019 Updated 6 years ago
- 200,000+ Sentences about Donald Trump with political bias labels ☆17 Jun 2, 2020 Updated 5 years ago
- A brief overview of how to use fastText to train powerful text classifiers in a python notebook. ☆15 Jun 18, 2017 Updated 8 years ago
- Extract Unique Word Lists From Wikipedia Database ☆13 May 27, 2020 Updated 5 years ago
- Scrapes the web. Gets the news. ☆13 Sep 6, 2016 Updated 9 years ago
- The machine learning library you really understand. ☆31 Nov 30, 2023 Updated 2 years ago
- ☆20 Dec 16, 2024 Updated last year
- Supporting code for Learning to Rank (LTR) presentation ☆16 Oct 11, 2018 Updated 7 years ago
- Software for the paper "Gender and Lexical Variation in Social Media" with David Bamman and Tyler Schnoebelen ☆17 Nov 10, 2015 Updated 10 years ago
- Kong OAuth SSO Integration ☆16 Aug 23, 2017 Updated 8 years ago
- A fork of bitbucket.org/tunystom/rankpy, adapted for Python3 and dmitru/pines ☆14 Mar 14, 2016 Updated 9 years ago
- Tools and services for evaluating topic models ☆15 Apr 12, 2016 Updated 9 years ago
- ☆13 Sep 10, 2021 Updated 4 years ago
- Document clustering in Python ☆30 May 24, 2016 Updated 9 years ago
- Document Similarity using Word2Vec ☆101 Jun 21, 2022 Updated 3 years ago
- Panoramix is a data exploration platform designed to be visual, intuitive and interactive ☆17 Jan 26, 2016 Updated 10 years ago
- The front end for http://archive.enron.email ☆18 Oct 8, 2017 Updated 8 years ago
- Toolkit to compile a comparable/parallel corpus from European Parliament proceedings ☆16 Jan 26, 2020 Updated 6 years ago
- A dashboard for rq-scheduler jobs ☆17 Dec 26, 2022 Updated 3 years ago
- ☆19 Feb 11, 2019 Updated 7 years ago
- A text analysis application for performing common NLP tasks through a web dashboard interface and an API ☆125 Jan 18, 2019 Updated 7 years ago