A storage format 5 to 100 times faster than Spark-Parquet
☆12Feb 22, 2016Updated 10 years ago
Alternatives and similar repositories for ya100
Users that are interested in ya100 are comparing it to the libraries listed below
- 延云 ydb: a real-time solution for big data at the hundred-billion-record scale☆31Mar 5, 2017Updated 8 years ago
- ServiceFramework example project☆10Apr 2, 2016Updated 9 years ago
- ☆14Apr 12, 2022Updated 3 years ago
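- Migration tool targeting one-way migration from Oracle, MySQL, and SqlServer to PostgreSQL, and two-way migration between PostgreSQL and big data platforms such as Hive, Hbase, and Impala.☆10Dec 3, 2014Updated 11 years ago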
- Distributed SQL-based real-time streaming computation framework on Apache Storm, Spark☆12Mar 14, 2016Updated 9 years ago
- Data exchange middleware based on ActiveMQ☆14Aug 17, 2014Updated 11 years ago
- MySQL to NoSQL real time dataflow☆19Oct 14, 2017Updated 8 years ago
- Container scheduling engine based on Yarn☆36Apr 5, 2016Updated 9 years ago
- A light Kafka to HDFS/S3 ETL library based on Apache Spark☆40Jun 29, 2017Updated 8 years ago
- Apache Hudi Demo☆22Apr 24, 2025Updated 10 months ago
- A general-purpose processing framework for personalized recommendation algorithms, based on Mahout and Lucene☆18May 25, 2015Updated 10 years ago
- Combine flume, spark-streaming, and redis for real-time computing☆22Oct 20, 2014Updated 11 years ago
- Factorization Machines on Spark and Glint☆25Nov 7, 2016Updated 9 years ago
- Real-time analytics in Apache Flume☆51Feb 2, 2016Updated 10 years ago
- A tool for translating Scala source code into readable and maintainable Java code☆13Jan 3, 2026Updated last month
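- Text deduplication algorithm, developed for deduplicating news in recommender systems. It uses Yahoo's Near-duplicates and shingling algorithm; the server is implemented in C and the client in Java, communicating through the thrift framework. For better extensibility, deduplication can run on the server, and the server also exposes computation interfaces so that clients can extend it themselves☆24Feb 25, 2014Updated 12 years ago
- Parses Mysql binlog logs and sends them to Kafka☆23Nov 25, 2016Updated 9 years ago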
- SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka☆29Jun 8, 2016Updated 9 years ago
- Easy Task is a simple, easy-to-use distributed task scheduling platform. An elastic distributed job scheduler system☆37Dec 16, 2022Updated 3 years ago
- ☆40Aug 3, 2015Updated 10 years ago
- This project has moved to https://github.com/cocolian/cocolian-nlp☆34Jun 8, 2014Updated 11 years ago
- A batch-processing system based on Spring Boot and Spring Batch.☆10Sep 10, 2018Updated 7 years ago
- Prevent your Windows system and monitor from sleeping.☆12Mar 16, 2017Updated 8 years ago
- Google Cloud Dataflow pipelines such as Identity-By-State as well as useful utility classes.☆37Aug 9, 2023Updated 2 years ago
- Data self exporting and monitoring platform based on Hive data warehouse. https://hc.smartloli.org☆36Jul 28, 2017Updated 8 years ago
- demo applications that show how to deploy offline feature engineering solutions to online in one minute with fedb and nativespark☆35Oct 15, 2024Updated last year
- 运满满 algorithm research and data development☆10Nov 13, 2017Updated 8 years ago
- ☆11Feb 15, 2022Updated 4 years ago
- Boost library subset for FireBreath☆13Apr 17, 2017Updated 8 years ago
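- Map/Reduce usage examples in hadoop: input (DBInputFormat) and output (DBOutputFormat) as MySql database tables, log analysis Grep, word sorting Sort... Basic HBase operations (create, delete, query, update), bulk importing data into HBase tables using Map/Reduce..…☆14Apr 6, 2013Updated 12 years ago
- Converts json or SQL into flink or spark streaming/batch jobs☆12Jun 21, 2022Updated 3 years ago
- flink 10 self-study notes and code☆14Jun 29, 2022Updated 3 years ago
- Based on the derby source code, with the SQL parsing functionality extracted by stripping out everything else. A cross-database join query tool is built on top of it. The tool splits a join query into multiple single-table queries and then joins the single-table result sets, moving the database's join functionality up into the tool. For details, see the wiki: readme☆10Feb 14, 2017Updated 9 years ago
- zdh series: a java-based business risk control engine☆13Jan 24, 2026Updated last month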
- This Pinyin Analysis plugin is used to do conversion between Chinese characters and Pinyin.☆10Mar 28, 2019Updated 6 years ago
- A curated collection of security-related mind maps☆11Sep 7, 2015Updated 10 years ago
- ☆11Sep 1, 2022Updated 3 years ago
- Spark projects. Learning book "Machine Learning with Spark"☆10Jun 3, 2017Updated 8 years ago
- DEPRECATED: Element Hiding Helper extension for Adblock Plus☆11Dec 1, 2017Updated 8 years ago