Plotly

Big Data for Python

Dash Enterprise is your front-end for horizontally scalable, big data computation in Python.

From Spark to Snowflake and from Dask to Datashader, the Python "big data" tech stack has never been more varied or robust.

Dash Enterprise supports turnkey connections to the most popular "big data" backends for Python, including Vaex, Dask, Datashader, RAPIDS, Databricks (PySpark), Snowflake, and Postgres.

In addition, Dash Enterprise ships with battle-tested, plug-and-play demos for best leveraging Dash with each of these technologies.

Scroll below to demo the latest in Python HPC through Dash user interfaces.


Vaex

Vaex is a Pandas-like library that can operate on vastly larger datasets through out-of-core memory mapping.


If you’re working with data that is too large to fit in memory, but you don’t want to go through the hassle of setting up Spark or Dask, give Vaex a try.
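To make "out-of-core memory mapping" concrete, here is a minimal sketch using only Python's standard library. It is not Vaex's API. The file layout (one little-endian float64 per row) and the helper names `write_column`, `mapped_sum`, and `demo` are assumptions made for illustration. The point is that the file is mapped rather than read, so only the chunk currently being scanned needs to be resident in memory.

```python
import mmap
import os
import struct
import tempfile

# Assumed on-disk layout for this sketch: one little-endian float64 per row.
RECORD = struct.Struct("<d")


def write_column(path, values):
    """Write a column of floats to disk as fixed-width binary records."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def mapped_sum(path, chunk_rows=4096):
    """Sum a column by memory-mapping the file and scanning it chunk by chunk."""
    total = 0.0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n_rows = len(mm) // RECORD.size
        for start in range(0, n_rows, chunk_rows):
            stop = min(start + chunk_rows, n_rows)
            # Slicing the map copies only this chunk; the rest stays on disk.
            chunk = mm[start * RECORD.size : stop * RECORD.size]
            total += sum(v for (v,) in RECORD.iter_unpack(chunk))
    return total


def demo():
    """Write 10,000 rows to a temporary file and sum them through the map."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        write_column(path, (float(i) for i in range(10_000)))
        return mapped_sum(path)
    finally:
        os.remove(path)
```

A library like Vaex applies the same idea to columnar file formats and adds a dataframe interface on top, so you would not normally write this scanning loop by hand.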

This Dash app uses Vaex to explore 117 million rows of data (7 GB) in real time.

Dask

Dask is a widely used parallel computing library for Python. It is gaining popularity relative to PySpark because it takes relatively little overhead to set up.

If you have a machine with multiple cores and a numerical computing problem that can be parallelized, give Dask a try.
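As a rough picture of what "a numerical computing problem that can be parallelized" looks like, the sketch below splits a sum of squares into independent chunks and farms them out to worker processes. It uses the standard library's `concurrent.futures` as a stand-in and is not Dask's API. The function names and the `executor_cls` parameter are hypothetical, chosen for this example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def partial_sum_of_squares(bounds):
    """Compute the sum of i*i over one half-open chunk [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split(n, chunks):
    """Split range(n) into at most `chunks` contiguous half-open intervals."""
    if n <= 0:
        return []
    step = -(-n // chunks)  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def parallel_sum_of_squares(n, chunks=4, executor_cls=ProcessPoolExecutor):
    """Map chunks onto a pool of workers, then reduce the partial results."""
    with executor_cls(max_workers=chunks) as pool:
        return sum(pool.map(partial_sum_of_squares, split(n, chunks)))
```

When using the default process pool from a script, call `parallel_sum_of_squares` under an `if __name__ == "__main__":` guard so worker processes can import the module safely. Passing `executor_cls=ThreadPoolExecutor` runs the same map and reduce steps on threads, which is convenient for quick checks but does not speed up CPU-bound pure-Python work. Dask automates this split, map, and reduce pattern for arrays and dataframes.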

Dash and Dask also work wonderfully with Datashader.

This Dash app uses Dask and Datashader to explore 40 million rows of data in real time.

Datashader

Datashader is an open-source Python library for server-side rendering of big data visualizations.

Dash apps integrate closely with Datashader to visualize big data. When zoomed out, Dash uses Datashader to render the entire “big data” visualization server-side. When zoomed in, Dash switches to Plotly graphing for interactive, high-resolution data exploration.
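The zoomed-out versus zoomed-in behavior can be sketched in plain Python. This is an illustration of the idea only, not Datashader's or Dash's API. The function names, the grid size, and the `max_raw_points` threshold are all assumptions. When many points are visible, the server reduces them to a fixed-size grid of counts, and when few are visible, it returns the raw points so the client can plot them individually.

```python
def rasterize(points, x_range, y_range, width, height):
    """Count points per pixel on a width x height grid."""
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if x0 <= x < x1 and y0 <= y < y1:
            # Clamp to guard against floating-point rounding at the upper edge.
            col = min(int((x - x0) / (x1 - x0) * width), width - 1)
            row = min(int((y - y0) / (y1 - y0) * height), height - 1)
            grid[row][col] += 1
    return grid


def view(points, x_range, y_range, width=400, height=300, max_raw_points=5000):
    """Return raw points when few are visible, otherwise an aggregated raster."""
    (x0, x1), (y0, y1) = x_range, y_range
    visible = [(x, y) for x, y in points if x0 <= x < x1 and y0 <= y < y1]
    if len(visible) <= max_raw_points:
        # Zoomed in: few enough points to send to the browser individually.
        return {"kind": "points", "data": visible}
    # Zoomed out: payload size depends on the grid, not on the point count.
    return {"kind": "raster", "data": rasterize(visible, x_range, y_range, width, height)}
```

The raster payload is always `width * height` counts regardless of how many points it summarizes, which is why server-side aggregation keeps the browser responsive as the dataset grows.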

Dash + Datashader can be scaled to hundreds of millions of points with Dask and RAPIDS. See the Dask and RAPIDS demos for examples.

This Dash app uses Datashader to visualize 1 million time series points in 300 lines of Python.

NVIDIA RAPIDS cuDF

cuDF is NVIDIA's Pandas-like library for running dataframe computations in GPU memory. 

If you have access to GPU memory, cuDF is one of the fastest ways to process big data in Python on a single node.

This Dash app uses cuDF to explore 300 million rows in real time.

Databricks

Databricks is the company founded by the original creators of Apache Spark, and the commercial platform built around Spark and PySpark.

Dash apps that use PySpark and are deployed on Dash Enterprise can call out to Databricks Spark clusters through the Dash Enterprise Job Queue and the databricks-connect utility.

Unlike Dask or RAPIDS, Spark does not work with Datashader, so PySpark dashboards cannot interactively visualize ~100 million rows of data while upsampling or downsampling in real time.

This Dash app uses databricks-connect to explore Yelp reviews on a Databricks Spark cluster.

Snowflake

Snowflake is a cloud-only, distributed commercial data warehouse with drivers for Python and R.

BI tools like Power BI or Tableau are typical front-ends for Snowflake. Use Dash when you need an AI front-end for NLP, computer vision, predictive analytics, or deep learning using data stored in Snowflake.

This Dash app performs NLP on product reviews stored in Snowflake, all in fewer than 1,000 lines of Python.

Postgres

If you’re managing a terabyte or less of tabular data, you may not need Spark, Dask, or Snowflake. Vaex or the Postgres Python driver will do!

Dash Enterprise ships with onboard Postgres and Redis databases to store and cache data for your Dash apps. Both Postgres and Redis are fast and easy to access from Python, R, and Julia.

This Dash app queries NYC complaints stored in Dash Enterprise’s onboard Postgres database.

We're proud to partner with these best-in-class big data Python solutions.

See Dash in action

Sign up for our next Dash Live Weekly demo session to learn more about our Dash Enterprise offering, including industry applications and all the latest tips and features on how to operationalize your data science models.
