Peak Finding in Python/v3
Learn how to find peaks and valleys in datasets in Python
Note: this page is part of the documentation for version 3 of Plotly.py, which is not the most recent version.
See our Version 4 Migration Guide for information about how to upgrade.
The version 4 version of this page is here.
New to Plotly?¶
Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!
In [1]:
import plotly.plotly as py
import plotly.graph_objs as go
import plotly.figure_factory as FF  # replaces the deprecated plotly.tools.FigureFactory
import numpy as np
import pandas as pd
import scipy
import peakutils
Import Data¶
To start detecting peaks, we will import some data on milk production by month:
In [2]:
milk_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/monthly-milk-production-pounds.csv')
time_series = milk_data['Monthly milk production (pounds per cow)']
time_series = time_series.tolist()
df = milk_data[0:15]
table = FF.create_table(df)
py.iplot(table, filename='milk-production-dataframe')
Out[2]:
Original Plot¶
In [3]:
trace = go.Scatter(
    x=list(range(len(time_series))),
    y=time_series,
    mode='lines'
)
data = [trace]
py.iplot(data, filename='milk-production-plot')
Out[3]:
With Peak Detection¶
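To mark the peaks on the plot, we first need their positions along the x-axis. peakutils.indexes returns the array indices of the detected peaks. Its thres argument is a normalized threshold between 0 and 1, and min_dist is the minimum distance between detected peaks.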
In [4]:
cb = np.array(time_series)
# thres is normalized (0 to 1), so this tiny value keeps essentially every local maximum.
# A min_dist below 1 imposes no minimum spacing between peaks.
indices = peakutils.indexes(cb, thres=0.02/max(cb), min_dist=0.1)
trace = go.Scatter(
    x=list(range(len(time_series))),
    y=time_series,
    mode='lines',
    name='Original Plot'
)
trace2 = go.Scatter(
    x=indices,
    y=[time_series[j] for j in indices],
    mode='markers',
    marker=dict(
        size=8,
        color='rgb(255,0,0)',
        symbol='cross'
    ),
    name='Detected Peaks'
)
data = [trace, trace2]
py.iplot(data, filename='milk-production-plot-with-peaks')
Out[4]:
Only Highest Peaks¶
Raising the threshold restricts detection to the highest peaks. Here we set it to 0.678 in an attempt to identify as many of the tallest peaks as possible while excluding the smaller ones.
In [5]:
cb = np.array(time_series)
# Raise the normalized threshold so that only the tallest peaks are kept.
indices = peakutils.indexes(cb, thres=0.678, min_dist=0.1)
trace = go.Scatter(
    x=list(range(len(time_series))),
    y=time_series,
    mode='lines',
    name='Original Plot'
)
trace2 = go.Scatter(
    x=indices,
    y=[time_series[j] for j in indices],
    mode='markers',
    marker=dict(
        size=8,
        color='rgb(255,0,0)',
        symbol='cross'
    ),
    name='Detected Peaks'
)
data = [trace, trace2]
py.iplot(data, filename='milk-production-plot-with-higher-peaks')
Out[5]:
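The examples above cover peaks only. A common way to find valleys is to negate the series and run the same peak detector on it, since every valley of the data is a peak of its negation. With peakutils this would presumably amount to passing -cb instead of cb to peakutils.indexes. The sketch below shows the idea with NumPy alone; the helper name local_minima and the signal are hypothetical and made up for illustration.

```python
import numpy as np

def local_minima(y):
    """Return indices of samples strictly lower than both neighbours."""
    y = np.asarray(y, dtype=float)
    # Negate the signal so that valleys become peaks, then look for
    # places where the slope changes from positive to negative.
    dy = np.diff(-y)
    return np.where((dy[:-1] > 0) & (dy[1:] < 0))[0] + 1

# Small made-up signal (not the milk data).
signal = np.array([3, 1, 4, 2, 6, 0, 5])
valleys = local_minima(signal)
print(valleys, signal[valleys])
```

The resulting indices can be plotted the same way as the detected peaks, using a second markers trace.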