Azure Databricks Security Best Practices

Azure Databricks is a Unified Data Analytics Platform that is a part of the Microsoft Azure Cloud.

Built upon the foundations of Delta Lake, MLflow, Koalas and Apache Spark™, Azure Databricks is a first-party PaaS on the Microsoft Azure cloud that provides one-click setup, native integrations with other Azure cloud services, an interactive workspace, and enterprise-grade security to power Data & AI use cases for customers ranging from small teams to large global enterprises.

The platform enables collaboration between the different data personas in an enterprise, such as Data Engineers, Data Scientists, Business Analysts and SecOps / Cloud Engineering.

In this article, we share a list of cloud security features and capabilities that an enterprise data team can use to harden their Azure Databricks environment in line with their governance policy.

Security that Unblocks the True Potential of your Data Lake

Learn how Azure Databricks helps address the challenges that come with deploying, operating and securing a cloud-native data analytics platform at scale.

Bring Your Own VNET

Learn what the Azure Databricks platform architecture looks like, and how you can deploy it in your own enterprise-managed virtual network so that you can apply the customizations required by your network security team.
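To make the idea of deploying into your own virtual network a bit more concrete, here is a minimal sketch in Python that assembles the workspace resource fragment of an ARM template. It assumes the `customVirtualNetworkId`, `customPublicSubnetName` and `customPrivateSubnetName` workspace parameters and the `2018-04-01` API version; the helper name `workspace_resource`, the SKU, and every ID and name passed to it are hypothetical placeholders, so please check the current template reference before relying on it.

```python
import json


def workspace_resource(name, location, managed_rg_id, vnet_id,
                       public_subnet, private_subnet):
    """Build an ARM resource fragment for a workspace in a customer-managed VNET.

    Illustrative only: the property names follow the workspace ARM template
    as we understand it; verify them against the current reference.
    """
    return {
        "type": "Microsoft.Databricks/workspaces",
        "apiVersion": "2018-04-01",  # assumed API version
        "name": name,
        "location": location,
        "sku": {"name": "premium"},  # assumed SKU
        "properties": {
            "managedResourceGroupId": managed_rg_id,
            "parameters": {
                # The enterprise-managed virtual network and its two subnets.
                "customVirtualNetworkId": {"value": vnet_id},
                "customPublicSubnetName": {"value": public_subnet},
                "customPrivateSubnetName": {"value": private_subnet},
            },
        },
    }


# Placeholder values; replace with your own subscription, groups and subnets.
SUBSCRIPTION = "/subscriptions/00000000-0000-0000-0000-000000000000"

resource = workspace_resource(
    name="example-workspace",
    location="westeurope",
    managed_rg_id=SUBSCRIPTION + "/resourceGroups/example-managed-rg",
    vnet_id=SUBSCRIPTION + "/resourceGroups/example-network-rg"
            "/providers/Microsoft.Network/virtualNetworks/example-vnet",
    public_subnet="public-subnet",
    private_subnet="private-subnet",
)

print(json.dumps(resource, indent=2))
```

Generating the fragment from a small function like this keeps the network-specific values in one place, which can make it easier for a network security team to review them.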

Trust But Verify with Azure Databricks

Get visibility into relevant platform activity in terms of who’s doing what and when, by configuring Azure Databricks Diagnostic Logs and other related audit logs in the Azure Cloud.
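As a rough illustration of the "who's doing what and when" question, the sketch below counts actions per user from a handful of JSON log lines. The record layout here (`time`, `user`, `category`, `action`) is a simplified assumption made for this example and is not the documented diagnostic log schema; the sample records are invented, and the function name `who_did_what` is hypothetical.

```python
import json
from collections import Counter

# Invented sample records in an ASSUMED, simplified layout.
# Real diagnostic logs use their own schema; adapt the field names.
SAMPLE_LOG_LINES = [
    '{"time": "2020-05-01T10:00:00Z", "user": "alice@example.com",'
    ' "category": "clusters", "action": "create"}',
    '{"time": "2020-05-01T10:05:00Z", "user": "bob@example.com",'
    ' "category": "notebook", "action": "runCommand"}',
    '{"time": "2020-05-01T10:07:00Z", "user": "bob@example.com",'
    ' "category": "notebook", "action": "runCommand"}',
]


def who_did_what(lines):
    """Count occurrences of each (user, category, action) triple."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        counts[(record["user"], record["category"], record["action"])] += 1
    return counts


activity = who_did_what(SAMPLE_LOG_LINES)
for (user, category, action), n in sorted(activity.items()):
    print(f"{user} performed {category}/{action} {n} time(s)")
```

In practice you would more likely run this kind of aggregation in your log analytics tool of choice; the point is only that once the logs are delivered, each record can be tied back to an identity and a timestamp.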

Securely Accessing Azure Data Sources from Azure Databricks

Understand the different ways of connecting Azure Databricks clusters in your private virtual network to your Azure Data Sources in a cloud-native secure manner.

Data Exfiltration Protection with Azure Databricks

Learn how to use cloud-native security constructs to create a battle-tested, secure architecture for your Azure Databricks environment that helps you prevent data exfiltration. This is most relevant for organizations working with personally identifiable information (PII), protected health information (PHI) and other types of sensitive data.

Enable Customer-Managed Keys with Notebooks

Azure Databricks notebooks are stored in the scalable management layer powered by Microsoft, and are by default encrypted with a Microsoft-managed per-workspace key. You can also bring your own key to encrypt the notebooks.

Simplify Data Lake Access with Azure AD Credential Passthrough

Control who has access to what data by using seamless identity federation with Azure AD under the hood, and get cloud-native visibility into who is processing the data and when.

See also the cloud-native access control for ADLS Gen 2 and how to configure it using Azure Storage Explorer. Such access management controls, including role-based access controls, are used seamlessly by Azure Databricks, as outlined in the passthrough article.
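For readers who create clusters programmatically, here is a minimal sketch of a cluster specification with credential passthrough switched on. It assumes the `spark.databricks.passthrough.enabled` Spark configuration key and the usual Clusters API request fields; the helper name `passthrough_cluster_spec` and the runtime version, node type and cluster name are placeholders. Depending on the cluster mode, additional settings may be required, so treat this as a starting point rather than a complete configuration.

```python
import json


def passthrough_cluster_spec(cluster_name, spark_version, node_type_id,
                             num_workers):
    """Build a cluster spec (as a dict) with credential passthrough enabled.

    Assumption: passthrough is enabled through the Spark configuration key
    below. Some cluster modes may need further settings; check the docs.
    """
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        "spark_conf": {
            "spark.databricks.passthrough.enabled": "true",
        },
    }


# Placeholder values for illustration only.
spec = passthrough_cluster_spec(
    cluster_name="passthrough-example",
    spark_version="7.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
    num_workers=2,
)

print(json.dumps(spec, indent=2))
```

With passthrough enabled, the access controls you define on ADLS Gen 2 decide what each user can read or write, so no storage credentials need to appear in the cluster specification itself.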

Azure Databricks is HITRUST CSF Certified

Azure Databricks is HITRUST CSF Certified to meet the required level of security and risk controls to support the regulatory requirements of our customers. This is in addition to the HIPAA compliance that applies through the Microsoft Azure BAA.

What’s Next?

Attend the Azure Databricks Security Best Practices Webinar and bookmark this page, as we’ll keep it updated with new security-related capabilities and controls.

If you want to try out the mentioned features, get started by creating an Azure Databricks workspace in your managed VNET.
