lionsoul2014 / jcseg
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.
☆915Updated last year
Alternatives and similar repositories for jcseg:
Users that are interested in jcseg are comparing it to the libraries listed below
- This project is a base package that wraps the tools most commonly used in NLP projects.☆1,493Updated 9 months ago
- No longer maintained. Please contact the original author.☆657Updated 6 years ago
- An Efficient Lexical Analyzer for Chinese☆330Updated 7 years ago
- HanLP Chinese word segmentation plugin for Lucene, supporting Lucene-based systems including Solr☆296Updated 4 years ago
- ☆637Updated 6 months ago
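- Distributed Chinese word segmentation component for Java - word segmenter☆1,817Updated 3 years ago
- Java open-source project cws_evaluation: evaluation and comparison of the segmentation quality of Chinese word segmenters☆951Updated 7 years ago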
- mmseg4j for lucene or solr analyzer☆398Updated 11 months ago
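- A production-grade, high-performance, modular, extensible Chinese NLP toolkit. (Chinese word segmentation, averaged perceptron, fastText, pinyin, new word discovery, segmentation error correction, BM25, person name recognition, named entities, custom dictionaries)☆679Updated last year
- Jieba word segmentation (Java version)☆2,600Updated 6 months ago
- Java implementation of keyword extraction with the TextRank algorithm☆201Updated 9 years ago
- Elasticsearch word segmentation plugin based on HanLP☆158Updated 3 years ago
- Time expression recognition in Chinese sentences, i.e. analyzing a Chinese sentence to identify the times mentioned in it.☆637Updated last year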
- mmseg4j core MMSEG for java chinese analyzer☆156Updated 5 years ago
- Java porting of Darts (Double ARray Trie System)☆267Updated 6 years ago
- A copy of http://sourceforge.net/projects/pinyin4j, then deploy it to maven central repository.☆1,253Updated last year
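- ansj word segmentation. A true Java implementation of ict; both segmentation quality and speed exceed the open-source version of ict. Chinese word segmentation, person name recognition, part-of-speech tagging, user-defined dictionaries☆6,496Updated last year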
- jieba analysis plugin for elasticsearch 7.0.0, 6.4.0, 6.0.0, 5.4.0, 5.3.0, 5.2.2, 5.2.1, 5.2, 5.1.2, 5.1.1☆532Updated last year
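- similarity: a text similarity calculation toolkit written in Java. It can be used for tasks such as text similarity calculation and sentiment analysis, and works out of the box.☆1,481Updated last week
- The plugin includes the `jieba` analyzer, `jieba` tokenizer, and `jieba` token filter, and has two modes you can choose from. One is `index` w…☆315Updated 3 years ago
- Chinese toolset, including Simplified/Traditional Chinese conversion, pinyin conversion, and Chinese word segmentation.☆181Updated 9 years ago
- Java implementation of Aho-Corasick, optimized for ASCII, with Unicode support.☆189Updated 10 years ago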
- HanLP Analyzer for Elasticsearch☆836Updated 6 months ago
- 🚲 STConvert is an analyzer that converts Chinese characters between Traditional and Simplified.☆362Updated last month
- A configurable web spider with an easy-to-use web console☆991Updated 6 years ago
- Chinese Word Segmentation Tool, a Java implementation of THULAC.☆85Updated 3 years ago
- A headless, standalone WebKit server that makes grabbing dynamic web pages easier.☆225Updated 5 years ago
- QuestionAnsweringSystem is a question answering system implemented in Java that automatically analyzes questions and returns candidate answers.☆1,954Updated 6 years ago
- Automatically builds a Chinese word lexicon: http://www.matrix67.com/blog/archives/5044☆648Updated last year
- Tokenizer supporting Lucene 5/6/7/8/9+ versions, LTS☆204Updated last year