Plot Data from Salesforce in Python/v3

Create interactive graphs with salesforce, IPython Notebooks and Plotly

Note: this page is part of the documentation for version 3 of, which is not the most recent version.
See our Version 4 Migration Guide for information about how to upgrade.

New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

Version Check

Plotly's python package is updated frequently. Run pip install plotly --upgrade to use the latest version.

In [1]:
import plotly


Salesforce reports are great for getting a handle on the numbers but Plotly allows for interactivity not built into the Reports Module in Salesforce. Luckily Salesforce has amazing tools around exporting data, from excel and csv files to a robust and reliable API. With Simple Salesforce, it's simple to make REST calls to the Salesforce API and get your hands on data to make real time, interactive charts.

This notebook walks you through that basic process of getting something like that set up. First you'll need Plotly. Plotly is a free web-based platform for making graphs. You can keep graphs private, make them public, and run Plotly on your own servers ( To get started visit . It's simple interface makes it easy to get interactive graphics done quickly.

You'll also need a Salesforce Developer (or regular Salesforce Account). You can get a salesforce developer account for free at their developer portal.

In [1]:
import plotly.plotly as py
import plotly.graph_objs as go

import pandas as pd
import numpy as np
from collections import Counter
import requests

from simple_salesforce import Salesforce
requests.packages.urllib3.disable_warnings() # this squashes insecure SSL warnings - DO NOT DO THIS ON PRODUCTION!

Log In to Salesforce

I've stored my Salesforce login in a text file however you're free to store them as environmental variables. As a reminder, login details should NEVER be included in version control. Logging into Salesforce is as easy as entering in your username, password, and security token given to you by Salesforce. Here's how to get your security token from Salesforce.

In [2]:
with open('salesforce_login.txt') as f:
    username, password, token = [x.strip("\n") for x in f.readlines()]
sf = Salesforce(username=username, password=password, security_token=token)

SOQL Queries

At this time we're going to write a simple SOQL query to get some basic information from some leads. We'll query the status and Owner from our leads. Further reference for the Salesforce API and writing SOQL queries: SOQL is just Salesforce's version of SQL.

In [3]:
leads_for_status = sf.query("SELECT Id, Status, Owner.Name FROM Lead")

Now we'll use a quick list comprehension to get just our statuses from those records (which are in an ordered dictionary format).

In [4]:
statuses = [x['Status'] for x in leads_for_status["records"]]
status_counts = Counter(statuses)

Now we can take advantage of Plotly's simple IPython Notebook interface to plot the graph in our notebook.

In [5]:
data = [go.Bar(x=status_counts.keys(), y=status_counts.values())]
py.iplot(data, filename='salesforce/lead-distributions')

While this graph gives us a great overview what status our leads are in, we'll likely want to know how each of the sales representatives are doing with their own leads. For that we'll need to get the owners using a similar list comprehension as we did above for the status.

In [6]:
owners = [x['Owner']['Name'] for x in leads_for_status["records"]]

For simplicity in grouping the values, I'm going to plug them into a pandas DataFrame.

In [7]:
df = pd.DataFrame({'Owners':owners, 'Status':statuses})

Now that we've got that we can do a simple lead comparison to compare how our Sales Reps are doing with their leads. We just create the bars for each lead owner.

In [8]:
lead_comparison = []
for name, vals in df.groupby('Owners'):
    counts = vals.Status.value_counts()
    lead_comparison.append(Bar(x=counts.index, y=counts.values, name=name))
In [9]:
py.iplot(lead_comparison, filename='salesforce/lead-owner-status-groupings')

What's great is that plotly makes it simple to compare across groups. However now that we've seen leads, it's worth it to look into Opportunities.

In [10]:
opportunity_amounts = sf.query("SELECT Id, Probability, StageName, Amount, Owner.Name FROM Opportunity WHERE AMOUNT < 10000")
In [11]:
amounts = [x['Amount'] for x in opportunity_amounts['records']]
owners = [x['Owner']['Name'] for x in opportunity_amounts['records']]
In [12]:
hist1 = go.Histogram(x=amounts)
In [13]:
py.iplot([hist1], filename='salesforce/opportunity-probability-histogram')
In [14]:
df2 = pd.DataFrame({'Amounts':amounts,'Owners':owners})
In [15]:
opportunity_comparisons = []
for name, vals in df2.groupby('Owners'):
    temp = Histogram(x=vals['Amounts'], opacity=0.75, name=name)
In [16]:
layout = go.Layout(
fig = go.Figure(data=opportunity_comparisons, layout=layout)
In [17]:
py.iplot(fig, filename='salesforce/opportunities-histogram')

By clicking on the "play with this data!" you can export, share, collaborate, and embed these plots. I've used it to share annotations about data and try out more colors. The GUI makes it easy for less technically oriented people to play with the data as well. Check out how the above was changed below or you can follow the link to make your own edits.

In [18]:
from IPython.display import HTML
    <a href="" target="_blank" title="Chuck vs Bill Sales Amounts" style="display: block; text-align: center;"><img src="" alt="Chuck vs Bill Sales Amounts" style="max-width: 100%;width: 1368px;"  width="1368" onerror="this.onerror=null;this.src='';" /></a>
    <script data-plotly="bill_chambers:21" src="" async></script>
Chuck vs Bill Sales Amounts

After comparing those two representatives. It's always helpful to have that high level view of the sales pipeline. Below I'm querying all of our open opportunities with their Probabilities and close dates. This will help us make a forecasting graph of what's to come soon.

In [19]:
large_opps = sf.query("SELECT Id, Name, Probability, ExpectedRevenue, StageName, Amount, CloseDate, Owner.Name FROM Opportunity WHERE StageName NOT IN ('Closed Lost', 'Closed Won') AND Amount > 5000")
In [20]:
large_opps_df = pd.DataFrame(large_opps['records'])
large_opps_df['Owner'] = large_opps_df.Owner.apply(lambda x: x['Name']) # just extract owner name
large_opps_df.drop('attributes', inplace=True, axis=1) # get rid of extra return data from Salesforce
Amount CloseDate ExpectedRevenue Id Name Owner Probability StageName
0 15000 2015-06-03 9000 0061a000002vYrwAAE Grand Hotels Kitchen Generator Bill C 60 Id. Decision Makers
1 90000 2015-05-03 81000 0061a000002vYsIAAU Grand Hotels SLA Chuck Brockerson 90 Negotiation/Review
2 80000 2015-05-22 60000 0061a000002vYs3AAE Express Logistics Portable Truck Generators Bill C 75 Proposal/Price Quote
3 22000 2015-05-07 11000 0061a000002vYruAAE Express Logistics Standby Generator Chuck Brockerson 50 Value Proposition
4 100000 2015-06-17 90000 0061a000002vYsCAAU University of AZ Installations Bill C 90 Negotiation/Review
In [21]:
scatters = []
for name, temp_df in large_opps_df.groupby('Owner'):
    hover_text = temp_df.Name + "<br>Close Probability: " + + "<br>Stage:" + temp_df.StageName
                size=(temp_df.Probability / 2) # helps keep the bubbles of managable size
In [22]:
data = scatters
layout = go.Layout(
    title='Open Large Deals',
        title='Close Date'
        title='Deal Amount',
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='salesforce/open-large-deals-scatter')

Plotly makes it easy to create many different kinds of charts. The above graph shows the deals in the pipeline over the coming months. The larger the bubble, the more likely it is to close. Hover over the bubbles to see that data. This graph is ideal for a sales manager to see how each of his sales reps are doing over the coming months.

One of the benefits of Plotly is the availability of features.