dk-stationery / stationery-ink
Distributed SQL-based real-time streaming computation framework on Apache Storm and Spark
☆12Mar 14, 2016Updated 9 years ago
Alternatives and similar repositories for stationery-ink
Users that are interested in stationery-ink are comparing it to the libraries listed below
- api gateway based on netty☆12Jun 14, 2018Updated 7 years ago
- ☆50Feb 11, 2020Updated 6 years ago
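- Migration tool for one-way migration from Oracle, MySQL, and SqlServer to PostgreSQL, and two-way migration between PostgreSQL and big data platforms such as Hive, Hbase, and Impala.☆10Dec 3, 2014Updated 11 years ago
- A storage format 5~100 times faster than Spark-Parquet☆12Feb 22, 2016Updated 9 years ago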
- SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka☆29Jun 8, 2016Updated 9 years ago
- Trident State implementation on top of Elasticsearch☆21May 18, 2015Updated 10 years ago
- Examples of using SparklingPandas and Pandas with PySpark☆16Aug 6, 2015Updated 10 years ago
- Data exchange middleware based on ActiveMQ☆14Aug 17, 2014Updated 11 years ago
- MySQL to NoSQL real time dataflow☆18Oct 14, 2017Updated 8 years ago
- Open-source distributed workflow scheduling tool that also supports streaming tasks.☆39Nov 11, 2017Updated 8 years ago
- A light Kafka to HDFS/S3 ETL library based on Apache Spark☆40Jun 29, 2017Updated 8 years ago
- ☆16Oct 27, 2017Updated 8 years ago
- A general-purpose processing framework for personalized recommendation algorithms, based on Mahout and Lucene☆18May 25, 2015Updated 10 years ago
- Apache Hudi Demo☆22Apr 24, 2025Updated 9 months ago
- Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream …☆22Feb 6, 2017Updated 9 years ago
- Kafka River Plugin for ElasticSearch☆88Jun 19, 2013Updated 12 years ago
- Real-time analytics in Apache Flume☆51Feb 2, 2016Updated 10 years ago
- DataFibers Data Service☆31Feb 11, 2022Updated 4 years ago
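- Parses Mysql binlog logs and sends them to Kafka☆23Nov 25, 2016Updated 9 years ago
- Text deduplication algorithm, derived from research on deduplicating news in recommender systems. It uses Yahoo's Near-duplicates and shingling algorithm; the server is implemented in C and the client in Java, communicating via the thrift framework. For extensibility, deduplication can run on the server, and the server also exposes computation interfaces so that clients can extend it themselves☆24Feb 25, 2014Updated 11 years ago
- A wrapper around multiple tokenizers, with a focus on modifying the original IK/MMSeg4j tokenizers; adds a shared pool of tokenizer objects and Lucene/Solr wrappers, where the Lucene/Solr version is 5.5.0.☆29May 5, 2017Updated 8 years ago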
- ☆40Aug 3, 2015Updated 10 years ago
- This project has moved to https://github.com/cocolian/cocolian-nlp☆34Jun 8, 2014Updated 11 years ago
- timetunnel is developed to transfer data in real time; it is used to collect log data and to sync database data at taobao☆39Jul 24, 2020Updated 5 years ago
- A batch-processing system based on Spring Boot and Spring Batch.☆10Sep 10, 2018Updated 7 years ago
- Apache Spark based ETL Engine☆71Oct 18, 2016Updated 9 years ago
- Google Cloud Dataflow pipelines such as Identity-By-State as well as useful utility classes.☆37Aug 9, 2023Updated 2 years ago
- LRU on-memory/disk cache for Swift☆34Apr 12, 2016Updated 9 years ago
- Data self exporting and monitoring platform based on Hive data warehouse. https://hc.smartloli.org☆36Jul 28, 2017Updated 8 years ago
- Collection of awesome Swift Snippets compiled with Swift 4☆10Oct 6, 2018Updated 7 years ago
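- Examples of using Map/Reduce in hadoop: input (DBInputFormat) and output (DBOutputFormat) as MySql database tables, log analysis Grep, word sorting Sort... basic operations on HBase (create, delete, query, update), using Map/Reduce to bulk-import data into HBase tables..…☆14Apr 6, 2013Updated 12 years ago
- Based on the derby source code, this extracts the sql parsing functionality by trimming the code down, and builds a cross-database join query tool on top of it. The tool splits a join query into multiple single-table queries and then joins the single-table result sets, moving the database's join functionality up into the tool. See the wiki: readme for details☆10Feb 14, 2017Updated 9 years ago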
- This Pinyin Analysis plugin is used to do conversion between Chinese characters and Pinyin.☆10Mar 28, 2019Updated 6 years ago
- flink 10 self-study notes and code☆14Jun 29, 2022Updated 3 years ago
- A framework for building spouts for Apache Storm and a Kafka based spout for dynamically skipping messages to be processed later.☆40Oct 12, 2021Updated 4 years ago
- ☆11Sep 1, 2022Updated 3 years ago
- Converts json or SQL into flink or spark streaming/batch tasks☆12Jun 21, 2022Updated 3 years ago
- Interplanetary Database: A Database built on top of IPFS and made immutable using Ethereum blockchain.☆10Sep 19, 2022Updated 3 years ago
- Spark projects. Learning book "Machine Learning with Spark"☆10Jun 3, 2017Updated 8 years ago