Step-by-step Databricks data connection setup
Databricks is a unified analytics platform built on Apache Spark. It combines data engineering, data science, and machine learning workflows in a collaborative environment optimized for large-scale data processing.
Connecting to Databricks requires a running SQL Warehouse, a Server Hostname, an HTTP Path, and a Personal Access Token. Follow these steps to connect to your Databricks data and visualize it with Plotly Studio:
Step 1: Ensure your SQL Warehouse is running
In your Databricks workspace, confirm that the SQL Warehouse you plan to query is in the "Running" state; start it from the console if it is stopped.
Step 2: Retrieve connection details
Navigate to the SQL Warehouses tab in the Databricks console. Select your specific warehouse and click on Connection details. Copy the Server Hostname and HTTP Path.
Step 3: Generate a Personal Access Token
Go to User Settings > Developer. Under Access tokens, click Generate new token. Copy the Personal Access Token (PAT) immediately, as it will not be shown again.
Credentials needed
- Server hostname: e.g., adb-123456789.0.azuredatabricks.net
- HTTP path: e.g., /sql/1.0/warehouses/a1b2c3d4e5f6g7h8
- Personal Access Token: your generated PAT (treat this like a password)
- Catalog/Schema (optional): e.g., samples / nyctaxi
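With those credentials in hand, the connection code that Plotly Studio's agent generates typically follows the shape below. This is a minimal sketch using databricks-sql-connector; the `DATABRICKS_*` environment-variable names are an assumed convention, not something the library requires:

```python
import os


def load_databricks_config(env=os.environ):
    """Collect the three required settings, failing fast on missing values.

    The DATABRICKS_* variable names are an assumed convention, not a
    requirement of databricks-sql-connector.
    """
    cfg = {
        "server_hostname": env.get("DATABRICKS_SERVER_HOSTNAME"),
        "http_path": env.get("DATABRICKS_HTTP_PATH"),
        "access_token": env.get("DATABRICKS_TOKEN"),
    }
    missing = [name for name, value in cfg.items() if not value]
    if missing:
        raise ValueError(f"Missing Databricks settings: {missing}")
    return cfg


def list_catalogs():
    """Open a connection and list the catalogs visible to this token."""
    # Import kept local so load_databricks_config() can be used even
    # where the databricks-sql-connector package is not installed.
    from databricks import sql

    with sql.connect(**load_databricks_config()) as conn:
        with conn.cursor() as cursor:
            cursor.execute("SHOW CATALOGS")
            return [row[0] for row in cursor.fetchall()]
```

The connector is installed with `pip install databricks-sql-connector`. Reading the PAT from an environment variable keeps it out of source code and chat history.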
Tip: contact us if you need help troubleshooting these steps.
LLM prompts for connecting to Databricks
Plotly Studio uses an AI agent to generate and execute the data connection code for you. The prompts below are ready to copy and paste directly into Plotly Studio's data connection chat. Use them to establish a connection, query your data, or do both in one shot. The global context rules are worth saving to your Plotly Studio global context to keep Databricks connections consistent across projects.
Connection prompt
Connect to Databricks SQL Warehouse using the databricks-sql-connector Python library. Use
the provided Server Hostname, HTTP Path, and Personal Access Token (PAT) for
authentication. Upon successful connection, list all available catalogs, schemas, and
tables in the workspace.
Query prompt
Using the established Databricks connection, retrieve all rows from the table
[CATALOG].[SCHEMA].[TABLE_NAME]. Execute the query using a cursor and return the
results as a pandas DataFrame. Use standard Spark SQL syntax.
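The query code this prompt produces usually reduces to a cursor-to-DataFrame step like the sketch below. The helper name is ours, not part of the connector; it relies only on the standard DB-API `description` and `fetchall()` attributes that databricks-sql-connector cursors expose:

```python
import pandas as pd


def rows_to_dataframe(cursor):
    """Build a pandas DataFrame from a DB-API cursor after execute() has run.

    Column names come from cursor.description; rows come from fetchall().
    """
    columns = [desc[0] for desc in cursor.description]
    return pd.DataFrame(cursor.fetchall(), columns=columns)


# Typical use with an open databricks-sql-connector cursor:
# cursor.execute("SELECT * FROM samples.nyctaxi.trips")
# df = rows_to_dataframe(cursor)
```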
Example one-shot prompt
Connect to Databricks SQL Warehouse using databricks-sql-connector with the following
credentials:
Server Hostname: [YOUR_SERVER_HOSTNAME]
HTTP Path: [YOUR_HTTP_PATH]
Personal Access Token: [YOUR_PAT]
Once connected, retrieve all rows from the table [CATALOG].[SCHEMA].[TABLE_NAME] and return
the result as a pandas DataFrame. Display a preview of the data.
Global context rules
Always use databricks-sql-connector as the primary connection library.
Authentication must be performed using a Personal Access Token rather than standard
username/password.
Always verify that the SQL Warehouse is "Running" before attempting to fetch the schema.
Use standard Spark SQL syntax for all queries.
Always return query results as a pandas DataFrame, building it from the fetchall() rows and the cursor's column descriptions.
Use fully qualified table names in the format CATALOG.SCHEMA.TABLE to avoid ambiguity.
Do not log or expose raw PAT values in any output or error messages.
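Two of these rules, fully qualified table names and PAT redaction, are easy to encode as small helpers. This is a sketch of the idea, not part of the connector's API:

```python
def fully_qualified(catalog: str, schema: str, table: str) -> str:
    """Join CATALOG.SCHEMA.TABLE, backtick-quoting each part so that
    hyphens or reserved words cannot break the query."""
    return ".".join(f"`{part}`" for part in (catalog, schema, table))


def redact(message: str, token: str) -> str:
    """Strip the raw PAT from any message before it is logged or displayed."""
    return message.replace(token, "[REDACTED]") if token else message
```

For example, `fully_qualified("samples", "nyctaxi", "trips")` produces an unambiguous name regardless of the session's current catalog or schema, and wrapping error messages in `redact()` keeps tokens out of tracebacks.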
Troubleshooting and tips
- SQL Warehouse status: The SQL Warehouse must be in "Running" state before connecting. If it's stopped or suspended, start it from the Databricks console before attempting to connect from Plotly Studio.
- Token security: Personal Access Tokens should be treated like passwords. Store them securely and rotate them periodically according to your organization's security policies.
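To check the warehouse state programmatically rather than in the console, you can query the Databricks SQL Warehouses REST API. The sketch below uses only the standard library; the `/api/2.0/sql/warehouses/{id}` endpoint path is taken from the Databricks REST API, but verify it against your workspace's API version:

```python
import json
import urllib.request


def parse_state(payload: dict):
    """Pull the warehouse state (e.g. "RUNNING", "STOPPED") out of the
    API response body, or None if absent."""
    return payload.get("state")


def warehouse_state(host: str, warehouse_id: str, token: str):
    """Ask the SQL Warehouses REST API whether the warehouse is running."""
    req = urllib.request.Request(
        f"https://{host}/api/2.0/sql/warehouses/{warehouse_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return parse_state(json.load(resp))
```

A pre-flight check like `warehouse_state(...) == "RUNNING"` lets a script fail with a clear message instead of a connection timeout.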
Connect to Databricks in minutes with Plotly Studio
Download today for free and get started with Plotly Studio.
