Databricks Secret Scopes: 2 Ways to Create and Manage
Hardcoded credentials in Databricks notebooks create security vulnerabilities that compound over time. When an API key leaks or a database password needs rotation, you're hunting through dozens of notebooks to find and update every instance. One missed reference means broken pipelines or, worse, a security incident.
Databricks secret scopes solve this by centralizing credential storage and providing secure access through a simple API. Instead of embedding passwords in code, you reference secrets that administrators control separately from notebook logic.
This guide covers two methods for creating secret scopes: Databricks-backed scopes for simpler setups and external store-backed scopes for enterprise environments with existing secrets infrastructure.
TL;DR: Databricks Secret Scopes
- Store API keys and passwords outside notebooks to prevent leaks.
- Databricks-backed scopes → simplest setup, no external services.
- External store-backed scopes → use Azure Key Vault / AWS Secrets Manager with existing rotation + audit trails.
- Retrieve secrets with dbutils.secrets.get() — values stay hidden in logs and outputs.
- Manage access at the scope level with READ / WRITE / MANAGE permissions.
- Use groups and separate scopes per environment (dev/staging/prod).
- For secure Databricks pipelines, Airbyte adds encrypted storage, RBAC, and audit logging across 600+ connectors.
What Are Databricks Secret Scopes?
Secret scopes are containers that store sensitive credentials like API keys, database passwords, and authentication tokens. They separate secrets from your notebook code, allowing you to reference credentials without exposing their values.
When you call dbutils.secrets.get(scope="my-scope", key="db-password"), Databricks retrieves the credential and injects it into your code at runtime. The actual value never appears in notebook outputs, logs, or revision history.
Access control happens at the scope level. You grant teams READ access to specific scopes, and they can use any secret within that scope without seeing the underlying values. Administrators with MANAGE permissions handle secret creation and rotation separately.
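For example, a notebook can pull a database password at runtime and inspect which keys a scope exposes. The scope and key names below are placeholders:

# Retrieve a secret at runtime; Databricks redacts the value if it appears in output
db_password = dbutils.secrets.get(scope="my-scope", key="db-password")

# List the keys available in a scope (metadata only, never values)
for secret_metadata in dbutils.secrets.list("my-scope"):
    print(secret_metadata.key)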
Which Secret Scope Type Should You Choose?
Databricks offers two backend options for storing secrets. Your choice depends on existing infrastructure, compliance requirements, and operational complexity tolerance.
Databricks-Backed Scopes
Databricks-backed scopes store secrets in Databricks' own encrypted backend. Setup requires no external dependencies, and secrets are managed entirely through Databricks CLI or API.
This option works well for teams without existing secrets infrastructure, smaller deployments, or proof-of-concept projects where speed matters more than centralized management across multiple services.
External Store-Backed Scopes
External store-backed scopes connect to Azure Key Vault, AWS Secrets Manager, or HashiCorp Vault. Secrets live in your existing infrastructure, and Databricks fetches them on demand.
This approach makes sense for enterprise environments where secrets serve multiple services beyond Databricks. Your existing rotation policies, audit trails, and access controls apply automatically.
Comparison: Databricks-Backed vs. External Store-Backed
How Do You Create a Databricks-Backed Secret Scope?
Creating a Databricks-backed scope takes a few minutes. You create the scope container in the web UI first, then add individual secrets through the CLI.
1. Access the Secret Scope Creation UI
Navigate directly to https://<databricks-instance>#secrets/createScope in your browser. This page isn't linked from the main Databricks UI, so you'll need to access it via the direct URL.
Replace <databricks-instance> with your workspace URL, such as adb-1234567890123456.7.azuredatabricks.net.
2. Configure Scope Settings
Enter a scope name using lowercase letters, numbers, and dashes only. Names like production-db-creds or analytics-api-keys work well.
Set the Manage Principal to control who can administer the scope. Choose "All Users" for shared team access or "Creator" to restrict management to yourself initially. Select "Databricks" as the backend type.
3. Add Secrets via Databricks CLI
Install the Databricks CLI if you haven't already, then configure it with your workspace credentials. Add secrets using the put command:
databricks secrets put --scope production-db-creds --key postgres-password
The CLI prompts you to enter the secret value interactively. This keeps the actual credential out of your shell history.
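To confirm the secret was stored, the same legacy CLI syntax can list the keys in a scope. The command returns key names and last-updated timestamps, never the values:

# List the keys stored in the scope (names and timestamps only)
databricks secrets list --scope production-db-creds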
4. Verify Secret Access
Test the secret in a Databricks notebook:
password = dbutils.secrets.get(scope="production-db-creds", key="postgres-password")
If you print the variable, the output shows [REDACTED] rather than the actual value. This redaction prevents accidental credential exposure in notebook outputs.
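From there, you pass the variable to whatever client needs it. A minimal sketch using a JDBC read, where the hostname, database, table, and username are placeholders:

# Use the secret as a JDBC password; the connection details below are placeholders
password = dbutils.secrets.get(scope="production-db-creds", key="postgres-password")
df = (spark.read.format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/analytics")
      .option("dbtable", "public.orders")
      .option("user", "analytics_user")
      .option("password", password)
      .load())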
How Do You Create an Azure Key Vault-Backed Secret Scope?
Azure Key Vault integration lets you manage secrets in a centralized location that serves both Databricks and other Azure services. Changes in Key Vault reflect immediately in Databricks without manual syncing.
1. Prerequisites
You need an Azure Key Vault instance with secrets already created. Gather the Key Vault DNS name (like my-vault.vault.azure.net) and the Resource ID from the Key Vault's Properties page in the Azure portal.
Your Databricks workspace and the Key Vault must belong to the same Azure tenant. They can sit in different subscriptions, provided you have the permissions to read the Key Vault and configure its access.
2. Create the Scope via Databricks CLI
Use the CLI with the Azure Key Vault backend type:
databricks secrets create-scope --scope azure-secrets \
--scope-backend-type AZURE_KEYVAULT \
--resource-id "/subscriptions/.../resourceGroups/.../providers/Microsoft.KeyVault/vaults/my-vault" \
--dns-name "https://my-vault.vault.azure.net"
3. Configure Key Vault Access Policies
Grant the Databricks service principal GET and LIST permissions on secrets in your Key Vault. In the Azure portal, navigate to your Key Vault, select Access Policies, and add a policy for the AzureDatabricks enterprise application.
If your Key Vault uses Azure RBAC instead of access policies, assign the Key Vault Secrets User role to the Databricks managed identity.
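One way to make that role assignment from the Azure CLI is sketched below; the principal object ID and vault resource ID are placeholders you'd replace with values from your own environment:

# Grant the Databricks principal read access to secrets in the vault (IDs are placeholders)
az role assignment create \
  --role "Key Vault Secrets User" \
  --assignee <databricks-principal-object-id> \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/my-vault"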
4. Reference Secrets in Notebooks
The API call syntax is identical to Databricks-backed scopes:
api_key = dbutils.secrets.get(scope="azure-secrets", key="my-api-key")
The key name must match the secret name in Azure Key Vault exactly. If you want a specific version rather than the latest, you'll need to manage that in Key Vault itself.
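To check which Key Vault secrets Databricks can see through the scope, list it from a notebook:

# Lists the secret names exposed by the Key Vault-backed scope (names only, no values)
for s in dbutils.secrets.list("azure-secrets"):
    print(s.key)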
How Do You Create an AWS Secrets Manager-Backed Secret Scope?
AWS Secrets Manager integration follows a similar pattern but requires IAM configuration for authentication between Databricks and AWS.
1. Prerequisites
Create secrets in AWS Secrets Manager and note their names or ARNs. Your Databricks workspace needs an IAM role or instance profile with permissions to read from Secrets Manager.
2. Configure IAM Permissions
Attach an IAM policy granting the secretsmanager:GetSecretValue action (and secretsmanager:ListSecrets if you need to enumerate secrets) to the role used by your Databricks cluster. Scope GetSecretValue to specific secret ARNs for least-privilege access; ListSecrets only accepts a wildcard resource, so grant it in a separate statement if needed.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["secretsmanager:GetSecretValue"],
    "Resource": "arn:aws:secretsmanager:us-east-1:123456789:secret:prod/*"
  }]
}
3. Create Scope via REST API
For AWS-backed scopes, use the Databricks REST API with the appropriate backend configuration. The specifics vary depending on your Databricks deployment (AWS-hosted vs. customer-managed), so consult the Databricks documentation for your setup.
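As a baseline, the generic scope-creation endpoint looks like the sketch below, with the workspace URL and token as placeholders; any AWS-specific backend configuration is additional and depends on your deployment:

# Create a scope through the Secrets REST API (workspace URL and token are placeholders)
curl -X POST https://<databricks-instance>/api/2.0/secrets/scopes/create \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"scope": "aws-secrets", "initial_manage_principal": "users"}'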
4. Test and Validate
Verify connectivity by retrieving a test secret from a notebook. If you encounter errors, check IAM permissions first since most failures trace back to missing or misconfigured policies.
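A minimal notebook check, assuming a secret named test-secret already exists in the scope:

# Fails if the scope is missing, the key doesn't exist, or IAM blocks access
try:
    value = dbutils.secrets.get(scope="aws-secrets", key="test-secret")
    print("Secret retrieved successfully")
except Exception as e:
    print(f"Secret retrieval failed: {e}")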
How Do You Manage Access Control for Secret Scopes?
Secret scope permissions determine who can read secrets, add new ones, or administer the scope itself. Getting this right matters for both security and team productivity.
ACL Permissions Model
Databricks uses three permission levels for secret scopes:
- READ: Retrieve secret values in notebooks
- WRITE: Add or update secrets within the scope
- MANAGE: Full control including granting permissions to others
Permissions apply at the scope level, not individual secrets. If someone has READ access to a scope, they can read any secret within it.
Granting and Revoking Access
Use the Databricks CLI to manage ACLs:
# Grant READ access to a group
databricks secrets put-acl --scope production-db-creds --principal data-engineers --permission READ
# Revoke access
databricks secrets delete-acl --scope production-db-creds --principal former-contractor@company.com
# List current permissions
databricks secrets list-acls --scope production-db-creds
Best Practices for Team Environments
- Use groups instead of individual users: Assign permissions to Databricks groups, making onboarding and offboarding simpler
- Separate scopes by environment: Create distinct scopes for dev, staging, and production to prevent accidental production access (see the sketch after this list)
- Limit MANAGE permissions: Most users only need READ access; reserve MANAGE for administrators
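Putting the first two practices together, a quick sketch using the same legacy CLI syntax as earlier; the scope and group names are illustrative:

# Separate scopes per environment with group-level grants (names are illustrative)
databricks secrets create-scope --scope dev-db-creds
databricks secrets create-scope --scope prod-db-creds
databricks secrets put-acl --scope dev-db-creds --principal data-engineers --permission READ
databricks secrets put-acl --scope prod-db-creds --principal platform-admins --permission MANAGE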
What Are Common Secret Scope Pitfalls?
Teams new to secret scopes often run into the same issues. Avoiding these mistakes saves debugging time and prevents security gaps.
- Hardcoding scope names: Use notebook widgets or environment variables to make code portable across environments (see the sketch after this list)
- Setting "All Users" as manage principal: This grants everyone administrative access to the scope, which defeats the purpose of access control
- Ignoring secret rotation: External stores can automate rotation; Databricks-backed scopes require manual updates that teams often forget
- Mixing environments in one scope: Production secrets accessible from development notebooks is a security incident waiting to happen
- Passing secrets to external systems: Databricks redacts secrets in its own logs, but downstream APIs or services might log the values you send them
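For the first pitfall, a widget-driven sketch keeps the scope name out of the code itself; the widget name and default value are placeholders:

# Parameterize the scope name so the same notebook runs against dev, staging, or prod
dbutils.widgets.text("secret_scope", "dev-db-creds")
scope = dbutils.widgets.get("secret_scope")
password = dbutils.secrets.get(scope=scope, key="postgres-password")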
How Does Secrets Management Fit Into Data Pipeline Security?
Secret scopes protect credentials for database connections, API integrations, and cloud storage access. They're one component of a broader security posture that includes encryption, RBAC, audit logging, and network isolation.
The challenge: orchestrators, transformation tools, and data integration platforms each handle credentials differently. When evaluating tools that connect to Databricks, look for encrypted credential storage, fine-grained access controls, and audit logging that matches your security requirements.
Building secure data pipelines that connect to Databricks? Airbyte provides encrypted credential storage, RBAC for connection access, and audit logging across 600+ connectors. Try Airbyte to set up your first secure connection.
Frequently Asked Questions
Can I migrate from Databricks-backed scopes to external store-backed scopes?
Not directly. Databricks doesn't support changing the backend type of an existing scope. You'll need to create a new scope with the external store backend, migrate your secrets, update notebook references to point to the new scope, then delete the old scope. Plan for a maintenance window if pipelines depend on these credentials.
What happens if I delete a secret that notebooks are using?
Notebooks will fail with an error when they try to retrieve the deleted secret. Databricks doesn't provide a soft-delete or recycle bin for secrets, so deletion is immediate and permanent. Audit your notebooks and jobs before removing any secrets to identify dependencies.
Are there limits on the number of scopes or secrets I can create?
Yes. Databricks enforces workspace-level limits on secret scopes and secrets: each scope can hold up to 1,000 secrets, each secret value is capped at 128 KB, and the number of scopes per workspace is also limited. These quotas don't vary by pricing tier; check the Databricks documentation for the exact limits that apply to your deployment.
How do I audit who accessed which secrets?
Enable audit logging in your Databricks workspace admin settings. Secret access events appear in the audit logs with the user, timestamp, scope, and key accessed. For external store-backed scopes, you get additional audit trails from Azure Key Vault or AWS CloudTrail depending on your backend.
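One way to query those events from a notebook, assuming your workspace has system tables enabled; the table, column, and action names below are assumptions based on the published audit schema, so verify them against your own logs:

# Secret read events from the audit system table (verify schema names in your workspace)
audit_df = spark.sql("""
    SELECT event_time, user_identity.email, request_params['scope'] AS secret_scope
    FROM system.access.audit
    WHERE service_name = 'secrets' AND action_name = 'getSecret'
    ORDER BY event_time DESC
""")
display(audit_df)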