Juan Torres

May 12, 2026

Visualizing the Concentration of Banking with Plotly

The Concentration of Banking app investigates the concentration of banking as a structural tendency in the U.S. economic system. We define the concentration of banking as the tendency of capital to centralize increasingly around a few financial institutions. The app argues, with hard data, that the U.S. banking sector is not a marketplace of many competitors drifting toward equilibrium. It is a system that concentrates. A handful of institutions keep absorbing the rest, and that handful keeps getting bigger.

This isn't a sentiment; it's a measurable phenomenon. The challenge was making it visible. That is the job of the Plotly visualizations that power the app, and that is what this article is really about. The data engineering breakdown behind each figure is covered in the accompanying YouTube video series.

To study the concentration of banking quantitatively, the project investigates two properties:

  • Concentration of consolidated assets among large commercial banks
  • Mergers and acquisitions activity of the Big Four (Chase, Bank of America, Citibank, Wells Fargo)

Our Plotly figures answer the following research question from different quantitative and qualitative angles: What is the tendency in the concentration of banking in the U.S. economy?

Concentration of Consolidated Assets

Consolidated assets are the total assets of a bank and all its subsidiaries. As such, they are a direct indicator of a bank's wealth. The Board of Governors of the Federal Reserve System collects and publishes this information on a quarterly basis. I developed a data pipeline to scrape, process, and visualize the Federal Reserve data. This source gives us three angles on the concentration of consolidated assets:


Linear Regression Analysis

[Figure: linear regression plot]

This linear regression line makes the following statement:

“The Big Four Banks accumulate roughly $20.89 billion in consolidated assets per quarter.”

The regression line fits the combined quarterly assets of Chase, Bank of America, Citibank, and Wells Fargo from Q3-2003 to Q3-2025.

This chart is a Plotly scatter (px.scatter) overlaid with a regression line produced by a scikit-learn LinearRegression model. Each dot is one bank's consolidated assets for one quarter. The white line is the ordinary-least-squares fit across the full 22-year window.

The equation in the corner:

f(x) = 20.89Bx + 640.42B
 
is the punchline. It says: on average, the Big Four accumulate about $20.89 billion in consolidated assets every quarter, starting from a baseline intercept of $640.42 billion back in 2003. That slope is not a forecast or a projection. It is the empirical rate at which these four institutions have been absorbing wealth for the last two decades.

Why linear regression is the right visualization here. The concentration of banking is a long-run tendency. Any single quarter tells you nothing: assets move on shocks, rate cycles, and accounting conventions. What you need is something that filters out the noise and exposes the underlying tendency.

That is exactly what a regression line does. Plotly's scatter mode lets all four banks remain distinguishable by color (Chase blue, Bank of America red, Citibank light blue, Wells Fargo yellow), while the fitted line collapses their aggregate trajectory into a single interpretable statement. The math proves the phenomenon; the visualization makes the proof readable without a PhD in econometrics.

It also makes the phenomenon reproducible. Any skeptic can re-run the fit and end up with essentially the same slope. A linear regression isn't just a chart; it is an open audit of the claim. Analysts can reproduce the work through the Google Colab Notebook I shared in my data engineering video series with Plotly.

Treemap Analysis

[Figure: treemap]

This treemap states: the Big Four banks collectively hold roughly 43% of all consolidated assets among large commercial banks in the U.S. as of Q3-2025.

  • Chase – 16.35%; $3.81 trillion
  • Bank of America – 11.35%; $2.65 trillion
  • Citibank – 7.91%; $1.84 trillion
  • Wells Fargo – 7.57%; $1.77 trillion
  • All other large commercial banks – 56.81%; $13.3 trillion

Where the regression line shows change over time, the treemap (px.treemap) shows the state right now. Four banks alone occupy roughly 43% of total consolidated assets, while the remaining 57% is split among hundreds of other banks. That is concentration rendered as area.

Why a treemap is the right visualization here. Treemaps encode proportion by area. The human visual system is excellent at comparing rectangles; it is much worse at comparing numbers in a table. When the goal is to communicate "four institutions control a share disproportionate to their count," a treemap does the work in one glance that a paragraph of percentages cannot. Plotly's treemap also supports hierarchical nesting (the "Big Four" parent rectangle encompassing the four individual bank cells), which reinforces the point that these four entities act, effectively, as a bloc.

Line Plot

[Figure: line plot]

The purple line is the total consolidated assets of all large commercial banks in the U.S.; the white line is the combined assets of the Big Four. Both rise sharply, but the gap tells a story on its own.

This is a straightforward Plotly line chart (px.line) with two traces. It's deceptively simple, and that is precisely its strength: it lets you see two things at once.

The first reading is absolute growth. Total large-bank assets rose from roughly $6.5 trillion in 2003 to an all-time high of $22.75 trillion by the first quarter of 2025. The Big Four's combined assets rose in parallel, from about $2 trillion to nearly $10 trillion. Both groups got dramatically richer. The 2020 jump – roughly $1.5 trillion transferred to large commercial banks in a single quarter during the COVID-era liquidity expansion – is the clearest example of a crisis accelerating concentration rather than correcting it.

The second reading is the ratio. Back in Q3 2003, the Big Four held 30.64% of all large-commercial-bank consolidated assets. By Q1 2025, that share had grown to 42.79%. So even though both groups saw historic asset growth over two decades, the Big Four grew faster in proportional terms. Concentration did not merely persist through a period of general expansion; it intensified during that period.
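The two readings can be checked with simple arithmetic. The sketch below uses only the figures quoted above; because the 2003 total is given as "roughly $6.5 trillion", the implied dollar amounts and growth multiples are approximations, not values taken from the dataset.

```python
# Figures quoted in the text.
total_2003 = 6.5         # trillions of dollars, approximate
total_q1_2025 = 22.75    # trillions of dollars
share_q3_2003 = 30.64    # Big Four share, percent
share_q1_2025 = 42.79    # Big Four share, percent

# Big Four assets implied by total x share.
big_four_2003 = total_2003 * share_q3_2003 / 100
big_four_2025 = total_q1_2025 * share_q1_2025 / 100

# Growth multiples over the period.
total_growth = total_q1_2025 / total_2003
big_four_growth = big_four_2025 / big_four_2003

# Gain in share, in percentage points.
point_gain = share_q1_2025 - share_q3_2003

print(f"Implied Big Four assets: {big_four_2003:.2f}T -> {big_four_2025:.2f}T")
print(f"Total grew {total_growth:.2f}x, Big Four grew {big_four_growth:.2f}x")
print(f"Share gain: {point_gain:.2f} percentage points")
```

The implied amounts line up with the "about $2 trillion" and "nearly $10 trillion" in the text, and the Big Four's growth multiple exceeds that of the total, which is the proportional claim restated in numbers.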

Why a line plot is the right visualization here. A line plot is the canonical form for showing a quantity against time, and showing two quantities against the same time axis makes the comparison visceral. Additionally, the hover tooltips reveal exact dollar values and percentage shares for any quarter. This plot shows that the concentration of banking is a general tendency within the banking industry and that it centralizes around a few financial institutions (i.e., the Big Four).

Activities of Mergers and Acquisitions of the Big Four

[Figure: mergers and acquisitions network plot]

The second property of the concentration of banking is the mergers and acquisitions activity of the Big Four banks. Through consolidated assets, we were able to quantify the concentration of banking. Through mergers and acquisitions activity, we qualify the mechanics of the pivotal moments that led to this wealth accumulation. The data comes from the Federal Financial Institutions Examination Council, downloaded from its CSV datasets as "Relationships.csv."

As a qualitative interactive historical analysis, the network plot paints a progressive timeline of absorption and consolidation around the center of the Big Four. Each dot is a predecessor institution that no longer exists as an independent entity; each line is an edge pointing to the surviving Big Four bank at the far right of the plot along the x-axis of time. That convergence is the concentration of banking rendered as a graph.

I also annotated each plot with canonical events: the years in which the bank changed its name. For example, in the Chase network plot, two dashed vertical lines mark key renaming events:

  • Chemical Bank → Chase Manhattan (1996)
  • Chase Manhattan → JPMorgan Chase (2001)

Plotly combined with networkx makes each node interactive: hovering reveals the predecessor's name, the year of the transformation, the regulatory transformation code, the accounting method (merger, acquisition, charter discontinued), and any notes on the merger or acquisition.

Why Plotly

Each of the four visualization types used in this project answers a distinct question:

  • Linear regression scatter answers "at what rate is concentration happening?" A quantitative, mathematical claim backed by an equation.
  • Treemap answers "what does concentration look like right now?" A snapshot of state, with proportion encoded as area.
  • Line plot answers "how have the Big Four's share and the total pie both evolved?" A two-trace comparison over time.
  • Network plot answers "through what sequence of absorptions did we arrive here?" A relational, interactive history.

What makes Plotly the right framework for all four is that it does not force you to choose between exploratory and explanatory. The regression line communicates a finished statistical finding, but the hover tooltip on every dot lets a reader verify it bank-by-bank, quarter-by-quarter. The treemap is a snapshot, but it is also a drillable hierarchy. The line plot is a clean two-trace comparison, but the hover values make every percentage share inspectable. The network plots are the clearest case: without interactivity they would be decorative; with it, they are a queryable historical record.

For a research topic where the core claim is "this is a structural property of the system, not a story about any one deal," that combination matters. The visualizations stand up to skepticism because each one is simultaneously a summary and an invitation to check the underlying data and replicate the study yourself. That is what data visualization is supposed to do, and it is why every chart on the Concentration of Banking app is built with Plotly.

Explore the project


If you want to replicate any of the figures above, the entire Dash application source, including the data-loading pipelines, the regression fit, and the network-plot generator functions, is public on GitHub. The treemap and linear regression analysis can be replicated by running the following Google Colab Notebook.
