MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
☆1,023Feb 6, 2026Updated 3 weeks ago
Alternatives and similar repositories for Mallet
Users who are interested in Mallet compare it to the libraries listed below.
- R package wrapping Mallet☆39Jul 21, 2022Updated 3 years ago
- CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.☆10,053Feb 10, 2026Updated 3 weeks ago
- CMU ARK Twitter Part-of-Speech Tagger☆577Dec 17, 2023Updated 2 years ago
- ☆3,172Nov 16, 2021Updated 4 years ago
- Topic Modelling for Humans☆16,371Nov 1, 2025Updated 4 months ago
- Apache OpenNLP☆1,583Feb 23, 2026Updated last week
- FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a suc…☆554Dec 19, 2017Updated 8 years ago
- Latent Dirichlet Allocation (LDA) model for Microblogs (Twitter, weibo etc.)☆321May 4, 2018Updated 7 years ago
- Library for fast text representation and classification.☆26,502Mar 22, 2024Updated last year
- MITIE: library and tools for information extraction☆2,962Sep 28, 2025Updated 5 months ago
- Python library for interactive topic model visualization. Port of the R LDAvis package.☆1,846Dec 4, 2025Updated 3 months ago
- Machine learning components for Apache UIMA☆132Jun 14, 2023Updated 2 years ago
- Scalable, fast, and lightweight system for large-scale topic modeling☆846Dec 28, 2020Updated 5 years ago
- Implementation of CRF (conditional random fields) and POS tagger☆79Jan 14, 2017Updated 9 years ago
- Topic modeling with latent Dirichlet allocation using Gibbs sampling☆1,308Jul 29, 2024Updated last year
- A Java package for the LDA and DMM topic models☆85Apr 17, 2019Updated 6 years ago
- An implementation of latent Dirichlet allocation in javascript☆185Aug 1, 2022Updated 3 years ago
- CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, rel…☆480Jul 7, 2023Updated 2 years ago
- Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings☆7,192Jul 27, 2025Updated 7 months ago
- Word2Vec Java Port☆193May 28, 2018Updated 7 years ago
- R package for web-based interactive topic model visualization.☆569Feb 6, 2024Updated 2 years ago
- Sent2Vec encoder and training code from the paper "Skip-Thought Vectors"☆2,052Jun 9, 2020Updated 5 years ago
- Multiview LSA☆11Jun 22, 2015Updated 10 years ago
- Topic models extension for Mallet & scikit-learn☆49Mar 27, 2017Updated 8 years ago
- Improving topic models LDA and DMM (one-topic-per-document model for short texts) with word embeddings (TACL 2015)☆179May 8, 2017Updated 8 years ago
- Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.☆202Jan 4, 2026Updated 2 months ago
- Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and …☆14,205Feb 23, 2026Updated last week
- Java Statistical Analysis Tool, a Java library for Machine Learning☆799Dec 16, 2022Updated 3 years ago
- 🦆 Contextually-keyed word vectors☆1,673Apr 23, 2025Updated 10 months ago
- Computation of the semantic interpretability of topics produced by topic models.☆179Apr 19, 2017Updated 8 years ago
- ☆15Aug 22, 2016Updated 9 years ago
- 💫 Industrial-strength Natural Language Processing (NLP) in Python☆33,254Nov 27, 2025Updated 3 months ago
- Quality information extraction at web scale.☆464Dec 27, 2018Updated 7 years ago
- Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allredu…☆8,662Updated this week
- An open-source NLP research library, built on PyTorch.☆11,889Nov 22, 2022Updated 3 years ago
- Scalable Topic Modeling using Variational Inference in MapReduce☆149Oct 20, 2015Updated 10 years ago
- Statistical Machine Intelligence & Learning Engine☆6,344Updated this week
- The Kyoto Language Modeling Toolkit☆27Nov 27, 2014Updated 11 years ago
- Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statisti…☆1,084Nov 30, 2023Updated 2 years ago