Azure: Read Data from Data Lake Gen2 with Databricks
Azure Databricks has become a go-to tool for analyzing big data, built on an Apache Spark environment. With this service you can program in Python, Scala, R, Java, and SQL.
There are currently four options for connecting from Databricks to ADLS Gen2:
- Using the ADLS Gen2 storage account access key directly
- Using a service principal directly (OAuth 2.0)
- Mounting an ADLS Gen2 filesystem to DBFS using a service principal (OAuth 2.0)
- Azure Active Directory (AAD) credential passthrough
In this article, I will explain the fourth option.
This approach lets us use VNet injection together with the mount point.
In the storage account's firewall rules, we must include the VNet that hosts the Databricks service. For details, see the article "Create Databricks service".
Cluster Configuration Example
Cluster values for this example:
- Python version
- Cluster mode
- Azure Active Directory login enabled (credential passthrough)
Additionally, the Databricks workspace must be on the Premium tier, since credential passthrough requires it.
Let's move on to the explanation.
You must have a token to create the mount point.
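As a sketch, on a standard (non-high-concurrency) cluster, credential passthrough is typically enabled through a Spark configuration entry like the one below; the exact setting may vary with your Databricks runtime version:

```
spark.databricks.passthrough.enabled true
```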
Since we already have the token, we can create the mount point. Note that we no longer use the values generated by the service principal; only the token is needed.
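A minimal sketch of such a mount, using the AAD credential-passthrough token provider that Databricks documents for ADLS Gen2. The container, storage account, and mount names are placeholders, and `spark`/`dbutils` are assumed to be injected by the Databricks notebook runtime:

```python
def mount_adls_passthrough(container: str, storage_account: str, mount_name: str) -> None:
    """Mount an ADLS Gen2 filesystem to DBFS using AAD credential passthrough.

    Assumes it runs inside a Databricks notebook, where `spark` and
    `dbutils` are provided by the runtime.
    """
    configs = {
        # Tell the ABFS driver to obtain tokens from the passthrough provider.
        "fs.azure.account.auth.type": "CustomAccessToken",
        "fs.azure.account.custom.token.provider.class":
            spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName"),
    }
    dbutils.fs.mount(
        source=f"abfss://{container}@{storage_account}.dfs.core.windows.net/",
        mount_point=f"/mnt/{mount_name}",
        extra_configs=configs,
    )
```

With this helper, a call such as `mount_adls_passthrough("raw", "mydatalake", "raw")` (hypothetical names) would expose the container under `/mnt/raw`.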
Now we can read a file from the storage account.
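Reading through the mount point can be sketched as below; the mount and file names are placeholders, and `spark` is again assumed to come from the Databricks notebook runtime:

```python
def read_mounted_csv(mount_name: str, relative_path: str):
    """Read a CSV file from a DBFS mount into a Spark DataFrame.

    Assumes a Databricks notebook, where `spark` is provided by the runtime.
    """
    return (
        spark.read
        .format("csv")
        .option("header", "true")       # treat the first row as column names
        .option("inferSchema", "true")  # let Spark guess column types
        .load(f"/mnt/{mount_name}/{relative_path}")
    )
```

For example, `read_mounted_csv("raw", "sales/2020.csv")` (hypothetical paths) would return a DataFrame you can inspect with `display()`.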
I hope this helps.