twosigma/flint

A Time Series Library for Apache Spark

☆1,172

Alternatives and similar repositories for flint

Users interested in flint often compare it to the libraries listed below.

sryza / spark-timeseries
A library for time series analysis on Apache Spark
☆1,197 · Oct 13, 2020 · Updated 5 years ago
twosigma / beakerx
Beaker Extensions for Jupyter Notebook
☆2,893 · Dec 4, 2023 · Updated 2 years ago
databricks / tensorframes
[DEPRECATED] Tensorflow wrapper for DataFrames on Apache Spark
☆744 · Jul 30, 2024 · Updated last year
databricks / spark-sklearn
(Deprecated) Scikit-learn integration package for Apache Spark
☆1,071 · Dec 3, 2019 · Updated 6 years ago
databricks / koalas
Koalas: pandas API on Apache Spark
☆3,372 · Mar 20, 2024 · Updated 2 years ago
combust / mleap
MLeap: Deploy ML Pipelines to Production
☆1,539 · Updated this week
typelevel / frameless
Expressive types for Spark.
☆898 · Updated this week
vegas-viz / Vegas
The missing MatPlotLib for Scala + Spark
☆729 · Jan 30, 2022 · Updated 4 years ago
man-group / arctic
High performance datastore for time series and tick data
☆3,088 · Apr 8, 2024 · Updated 2 years ago
databricks / databricks-accelerators
Accelerate the use of Databricks for customers [public repo]
☆15 · Dec 4, 2019 · Updated 6 years ago
scalanlp / breeze
Breeze is/was a numerical processing library for Scala.
☆3,455 · Oct 4, 2025 · Updated 9 months ago
apache / incubator-toree
Mirror of Apache Toree (Incubating)
☆751 · Jul 17, 2026 · Updated last week
databricks / spark-deep-learning
Deep Learning Pipelines for Apache Spark
☆1,989 · Mar 30, 2023 · Updated 3 years ago
cloudera / livy
Livy is an open source REST interface for interacting with Apache Spark from anywhere
☆1,008 · Oct 5, 2022 · Updated 3 years ago
holdenk / spark-testing-base
Base classes to use when writing tests with Spark
☆1,553 · Apr 20, 2026 · Updated 3 months ago
microsoft / SynapseML
Simple and Distributed Machine Learning
☆5,233 · Jul 6, 2026 · Updated 2 weeks ago
filodb / FiloDB
Distributed Prometheus time series database
☆1,468 · Updated this week
databricks / spark-tfocs
A Spark port of TFOCS: Templates for First-Order Conic Solvers (cvxr.com/tfocs)
☆90 · Apr 15, 2024 · Updated 2 years ago
databricks / spark-corenlp
Stanford CoreNLP wrapper for Apache Spark
☆419 · Nov 15, 2018 · Updated 7 years ago
h2oai / sparkling-water
Sparkling Water provides H2O functionality inside Spark cluster
☆979 · Nov 5, 2025 · Updated 8 months ago
amplab / keystone
Simplifying robust end-to-end machine learning on Apache Spark.
☆473 · Apr 18, 2017 · Updated 9 years ago
spark-jobserver / spark-jobserver
REST job server for Apache Spark
☆2,837 · Mar 3, 2026 · Updated 4 months ago
spark-notebook / spark-notebook
Interactive and Reactive Data Science using Scala and Spark.
☆3,142 · May 16, 2023 · Updated 3 years ago
linkedin / photon-ml
A scalable machine learning library on Apache Spark
☆797 · Aug 30, 2021 · Updated 4 years ago
yahoo / TensorFlowOnSpark
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
☆3,846 · Jul 10, 2023 · Updated 3 years ago
blue-yonder / tsfresh
Automatic extraction of relevant features from time series:
☆9,277 · Jul 6, 2026 · Updated 2 weeks ago
TIBCOSoftware / snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in…
☆1,032 · Nov 21, 2022 · Updated 3 years ago
Ackuq / spark-pit
Point-in-Time optimizations for Apache Spark
☆30 · Jan 18, 2024 · Updated 2 years ago
RJT1990 / pyflux
Open source time series library for Python
☆2,136 · Oct 24, 2023 · Updated 2 years ago
twitter / algebird
Abstract Algebra for Scala
☆2,299 · Nov 21, 2025 · Updated 8 months ago
saddle / saddle
SADDLE: Scala Data Library
☆508 · Mar 21, 2020 · Updated 6 years ago
graphframes / graphframes
GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs
☆1,194 · Updated this week
spotify / featran
A Scala feature transformation library for data science and machine learning
☆475 · Feb 7, 2025 · Updated last year
intel / ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V,…
☆8,870 · Jan 28, 2026 · Updated 5 months ago
salesforce / TransmogrifAI
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows…
☆2,277 · Jun 2, 2026 · Updated last month
facebook / prophet
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
☆20,325 · May 8, 2026 · Updated 2 months ago
Stratio / sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
☆530 · Oct 24, 2019 · Updated 6 years ago
uber / petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet f…
☆1,888 · Jan 2, 2026 · Updated 6 months ago
wesm / feather
Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow
☆2,757 · Dec 8, 2025 · Updated 7 months ago