Frequency Counts in Python/v3

Learn how to perform frequency counts using Python.


Note: this page is part of the documentation for version 3 of Plotly.py, which is not the most recent version.
See our Version 4 Migration Guide for information about how to upgrade.

New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

Imports

The tutorial below imports NumPy, pandas, and SciPy.

In [1]:
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.tools import FigureFactory as FF

import numpy as np
import pandas as pd
import scipy

Make the Data

We generate a 1D dataset from a Weibull distribution with shape parameter $a$, whose samples can be written as

$$ \begin{align*} X = (-\ln(U))^{\frac{1}{a}} \end{align*} $$

where $U$ is drawn from the uniform distribution on $(0, 1]$.

In [17]:
x = np.random.weibull(1.25, size=1000)
print(x[:10])
[ 0.86317076  0.79217698  2.07432654  0.70721605  0.24102326  1.44261213
  0.85526797  1.0158948   1.19976016  1.78112064]
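
To make the sampling formula concrete, the following is a minimal sketch of inverse-transform sampling for a unit-scale Weibull distribution. The helper name `weibull_inverse_transform`, the seed, and the sample size are assumptions chosen for illustration; they are not part of the original notebook. The sample mean is compared against the theoretical mean $\Gamma(1 + 1/a)$ as a rough sanity check.

```python
import math

import numpy as np


def weibull_inverse_transform(a, size, rng):
    """Draw unit-scale Weibull samples via X = (-ln U)^(1/a) (illustrative helper)."""
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and the logarithm is always finite.
    u = 1.0 - rng.random(size)
    return (-np.log(u)) ** (1.0 / a)


rng = np.random.default_rng(0)  # arbitrary seed, used only for reproducibility
a = 1.25
samples = weibull_inverse_transform(a, 100_000, rng)

# For unit scale, the theoretical mean is Gamma(1 + 1/a).
theoretical_mean = math.gamma(1.0 + 1.0 / a)
print(samples.mean(), theoretical_mean)
```

With a large sample the two printed values should be close, though they will not match exactly.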

Histogram

A histogram divides a 1D dataset into bins of a chosen width and counts the observations that fall into each bin. Dividing those counts by the total number of observations gives a discrete approximation of the underlying probability distribution.

In [21]:
trace = go.Histogram(x=x, xbins=dict(start=np.min(x), size=0.25, end=np.max(x)),
                   marker=dict(color='rgb(0, 0, 100)'))

layout = go.Layout(
    title="Histogram Frequency Counts"
)

fig = go.Figure(data=[trace], layout=layout)
py.iplot(fig, filename='histogram-freq-counts')
Out[21]:
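
The counts behind a histogram like the one above can also be computed directly. The sketch below uses `np.histogram` with explicit bin edges that start at the minimum value and advance in steps of 0.25, roughly mirroring the `xbins` settings. The seed and the way the edges are built are assumptions for illustration, and Plotly's own handling of values that fall exactly on a bin edge may differ.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, not from the original notebook
x = rng.weibull(1.25, size=1000)

bin_size = 0.25
# Build edges from the minimum in steps of bin_size until the maximum is covered.
n_bins = int(np.floor((x.max() - x.min()) / bin_size)) + 1
edges = x.min() + bin_size * np.arange(n_bins + 1)

# counts[i] is the number of values in [edges[i], edges[i + 1]).
counts, edges = np.histogram(x, bins=edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"[{left:.2f}, {right:.2f}): {count}")
```

Every observation lands in exactly one bin, so the counts add up to the size of the dataset.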

Larger Bins

We can experiment with the bin size by grouping the data into larger intervals, which produces fewer bars with higher counts and hides some of the finer detail.

In [20]:
trace = go.Histogram(x=x, xbins=dict(start=np.min(x), size=0.75, end=np.max(x)),
                   marker=dict(color='rgb(0, 0, 100)'))

layout = go.Layout(
    title="Histogram Frequency Counts"
)

fig = go.Figure(data=[trace], layout=layout)
py.iplot(fig, filename='histogram-freq-counts-larger-bins')
Out[20]:
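
To see how the two bin sizes relate, the sketch below counts the same data with widths 0.25 and 0.75. Because both sets of bins start at the minimum and 0.75 is three times 0.25, each wide bin should contain the sum of three consecutive narrow bins. The helper `frequency_counts` and the seed are hypothetical and used only for this illustration.

```python
import numpy as np


def frequency_counts(values, bin_size):
    """Count values in equal-width bins starting at the minimum (illustrative helper)."""
    n_bins = int(np.floor((values.max() - values.min()) / bin_size)) + 1
    edges = values.min() + bin_size * np.arange(n_bins + 1)
    counts, _ = np.histogram(values, bins=edges)
    return counts


rng = np.random.default_rng(0)  # arbitrary seed, not from the original notebook
x = rng.weibull(1.25, size=1000)

fine = frequency_counts(x, 0.25)
coarse = frequency_counts(x, 0.75)

# Pad the fine counts with zeros to a multiple of three, then sum each group of three.
padded = np.pad(fine, (0, (-len(fine)) % 3))
merged = padded.reshape(-1, 3).sum(axis=1)

print(len(fine), len(coarse))
print(np.array_equal(merged, coarse))
```

The total count is unchanged by the choice of bin size; only how it is spread across bins differs.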