Document preprocessing that prepares formatted input data suitable for the LibSVM tool.
☆50Mar 10, 2017Updated 9 years ago
Alternatives and similar repositories for document-processor
Users that are interested in document-processor are comparing it to the libraries listed below
- Refactored version of https://github.com/shirdrn/document-processor.git☆15Apr 5, 2017Updated 8 years ago
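- Event extraction☆10Dec 15, 2016Updated 9 years ago
- Tianliang Chinese sentiment classifier, built on VSM, the Tianliang word segmenter, and dynamic adjustment of multiple influence factors; positive accuracy 81%, negative accuracy 74%☆10Jun 20, 2022Updated 3 years ago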
- Session-based Recommendations with Recurrent Neural Networks☆14Dec 14, 2017Updated 8 years ago
- A Simple Http to Raw Socket Adapter for Android☆12Aug 30, 2015Updated 10 years ago
- ☆12Nov 20, 2023Updated 2 years ago
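- Projects often use read/write splitting to improve TPS (concurrency). This demo implements dynamic data source switching (1: automatic switching via annotations and AOP; 2: manual switching)☆13Aug 4, 2023Updated 2 years ago
- Akka documentation in Chinese, translated from the official documentation☆28Mar 16, 2015Updated 11 years ago
- Web architecture for working with HBase via springmvc + phoenix☆10Aug 20, 2018Updated 7 years ago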
- ☆13Nov 29, 2018Updated 7 years ago
- Course: DD2412 Deep Learning Advanced at KTH Project by Casper, Magnus, and Friso Focus: Self-supervised learning and computer vision wit…☆11Dec 15, 2023Updated 2 years ago
- ☆11Aug 31, 2015Updated 10 years ago
- Safir Monitor Dashboard (Horizon plugin)☆10Dec 14, 2020Updated 5 years ago
- Trie Filter (sensitive-word filtering), based on https://github.com/dingyaguang117/DoubleArrayTrie☆16Aug 13, 2015Updated 10 years ago
- Use the knowledge graph generated by GraphRAG as the external knowledge base for the Dify workflow.☆21Jun 4, 2025Updated 9 months ago
- kafka-connect-redis-source☆13Mar 1, 2026Updated 3 weeks ago
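- Text deduplication algorithm, developed from research on deduplicating news in a recommender system. It uses Yahoo's Near-duplicates and shingling algorithm; the server is implemented in C and the client in Java, communicating through the Thrift framework. For extensibility, deduplication can run on the server, and the server also exposes a computation interface so clients can extend it themselves☆24Feb 25, 2014Updated 12 years ago
- Source code written for the recorded Spark video course https://edu.hellobi.com/course/107/overview☆13Apr 23, 2019Updated 6 years ago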
- Naive Bayes and kNN Java demo☆14Aug 29, 2013Updated 12 years ago
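- rank is an SEO tool for analyzing a website's search engine indexing and ranking.☆68May 15, 2017Updated 8 years ago
- Detect duplicated items. A content deduplication framework.☆11Apr 30, 2015Updated 10 years ago
- Mine your QQ chat history☆10May 25, 2017Updated 8 years ago
- Java data link layer packet framing protocol that implements packet integrity checking and can be used for wireless module communication☆12May 22, 2017Updated 8 years ago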
- A Java version of ftrl algorithm☆24Apr 28, 2017Updated 8 years ago
- FP-tree algorithm for mining frequent patterns☆20Aug 6, 2013Updated 12 years ago
- A Nutch 2.2.1 plugin which allows users to shuffle off the responsibility for retrieving pages to a selenium hub/node spoke system. This …☆16Jun 9, 2016Updated 9 years ago
- A project whose code is mostly extracted from the spark-yarn module, making it easier to build YARN programs☆13Apr 9, 2016Updated 9 years ago
- Open Source Simple Web Crawler for Java. Simple Flexible And Lightweight☆29Sep 1, 2022Updated 3 years ago
- Distributed Factorization Machines and LR with ps-lite☆10Sep 27, 2017Updated 8 years ago
- Semantic, sentiment, and similarity analysis.☆59Jul 23, 2015Updated 10 years ago
- Recommendation Web Service☆17Apr 17, 2013Updated 12 years ago
- A Jenkins plugin for deploying and stopping Apache Spark applications in Spark standalone clusters.☆10Oct 25, 2015Updated 10 years ago
- Source code analysis of Spark, plus some notes summarizing day-to-day work☆31Dec 25, 2015Updated 10 years ago
- Label Studio is a multi-type data labeling and annotation tool with standardized output format☆10Nov 17, 2021Updated 4 years ago
- Container scheduling engine based on Yarn☆36Apr 5, 2016Updated 9 years ago
- Android UVPN, a handy tool for circumventing network restrictions☆10Apr 6, 2017Updated 8 years ago
- ☆24Jul 20, 2017Updated 8 years ago
- Python bindings for controlling MPI probe stations☆10Mar 10, 2026Updated last week
- A Java image file for cool manipulation very easily☆49Dec 5, 2012Updated 13 years ago