sirrice / dbtruck
just put my data in a database!
☆39Updated 9 years ago
Alternatives and similar repositories for dbtruck
Users who are interested in dbtruck are comparing it to the libraries listed below.
- ☆92Updated 9 years ago
- Functional, Typesafe, Declarative Data Pipelines☆139Updated 7 years ago
- A prototype of Hive UDFs/UDTFs that execute nested SQL queries within rows.☆54Updated 10 years ago
- S3 backed ContentsManager for jupyter notebooks☆14Updated 9 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc.☆58Updated 4 years ago
- Utils around luigi.☆66Updated 2 months ago
- An open-source, vendor-neutral data context service.☆160Updated 7 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams.☆63Updated 9 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily☆112Updated 10 years ago
- zenvisage's foundational framework☆69Updated 2 years ago
- A platform for real-time streaming search☆102Updated 9 years ago
- Material for some talks I have given☆62Updated last year
- ☆84Updated 7 years ago
- Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data Engine☆167Updated 4 years ago
- Apache Spark AWS Lambda Executor (SAMBA)☆44Updated 7 years ago
- Looking at big data? Add a little salt.☆59Updated 2 years ago
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra.☆116Updated 3 years ago
- SociaLite: query language for large-scale graph analysis and data mining☆110Updated 9 years ago
- JSON -> Relational DB Column Types☆63Updated 2 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible.☆52Updated 8 years ago
- Scheduled task execution on top of AWS Data Pipeline☆43Updated 10 years ago
- Task Orchestration Tool Based on SWF and boto3☆38Updated 7 years ago
- A Cascading Workflow Visualizer☆83Updated 2 years ago
- Pig on Apache Spark☆82Updated 10 years ago
- ☆110Updated 8 years ago
- Functional Airflow DAG definitions.☆38Updated 8 years ago
- FlashX is a collection of big data analytics tools that perform data analytics in the form of graphs and matrices.☆236Updated 5 years ago
- Distributed Streaming Quantiles (for PySpark)☆38Updated 11 years ago
- A Python library for creating fast, repeatable and self-documenting data analysis pipelines.☆242Updated 2 weeks ago
- A Python wrapper over the GraphGen system☆37Updated 8 years ago