Highlight
Azure Data Lake Storage Gen2 is the best storage solution for big data analytics in Azure. With its Hadoop-compatible access, it is a perfect fit for existing platforms like Databricks, Cloudera, Hortonworks, Hadoop, HDInsight and many more. Take advantage of both Blob Storage and a data lake in one service!
Intro
In this episode I give you an introduction to what Azure Data Lake Storage (ADLS) is, how it works, and how you can leverage it in your big data workloads. I will also explain the differences between Blob Storage and ADLS.
Agenda
In a short demo I will show you
- What is Data Lake Storage, how does it work, and why is it called Gen2?
- What does it mean to be designed for big data analytical workloads?
- How does multi-protocol access work?
- What are the key differences between ADLS and Blob Storage?
- Quick demo of creating ADLS in the portal
- Quick demo of connecting from Power BI and using multi-protocol access
- How to use Storage Explorer with ADLS
- How do Access Control Lists work and how to manage them
- Demo with Databricks and ADLS
Sample code from demo: https://pastebin.com/ee7ULpwx
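To make the multi-protocol idea from the agenda a bit more concrete, here is a minimal sketch of how one and the same file can be addressed through the Blob endpoint, the Data Lake (DFS) endpoint, and the `abfss://` URI scheme used by Hadoop-compatible tools such as Databricks. The helper name `adls_endpoints` and the account, container and path values are made up for illustration; only the URL patterns reflect how the service is addressed.

```python
def adls_endpoints(account: str, container: str, path: str) -> dict:
    """Build the three common addresses for one file in an ADLS Gen2 account.

    Hypothetical helper for illustration only; it does not call Azure.
    """
    path = path.lstrip("/")  # avoid a double slash after the container name
    return {
        # Blob Storage REST endpoint (classic blob tools and SDKs)
        "blob": f"https://{account}.blob.core.windows.net/{container}/{path}",
        # Data Lake Storage REST endpoint (hierarchical namespace operations)
        "dfs": f"https://{account}.dfs.core.windows.net/{container}/{path}",
        # ABFS driver URI (Spark, Databricks, HDInsight and other Hadoop tools)
        "abfss": f"abfss://{container}@{account}.dfs.core.windows.net/{path}",
    }


if __name__ == "__main__":
    # "mystorage", "demo" and the file path are placeholder values.
    for protocol, url in adls_endpoints("mystorage", "demo", "/raw/sales.csv").items():
        print(f"{protocol:>5}: {url}")
```

All three addresses point at the same underlying data, which is why a file written by a Spark job can be read by a tool that only speaks the Blob API.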
Video
Next steps for you after watching the video
- Azure Data Lake Storage documentation
- https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction?WT.mc_id=AZ-MVP-5003556
- Transform data using Databricks and ADLS demo tutorial
- https://docs.microsoft.com/en-us/azure/azure-databricks/databricks-extract-load-sql-data-warehouse?WT.mc_id=AZ-MVP-5003556
- More on multi-protocol access
- https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-multi-protocol-access?WT.mc_id=AZ-MVP-5003556
- Read more on ACL
- https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control?WT.mc_id=AZ-MVP-5003556
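As a small companion to the ACL documentation linked above, here is a rough sketch of how the POSIX-style ACL strings used by ADLS Gen2 (for example `user::rwx,group::r-x,other::---`) can be read, and how the mask entry limits named users and groups. The names `AclEntry`, `parse_acl` and `effective_perms` and the identifier `1234` are hypothetical; this is a simplified illustration, not the service's actual permission check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    scope: str      # "access" or "default"
    kind: str       # "user", "group", "mask" or "other"
    qualifier: str  # object ID for named entries, "" for owner / owning group
    perms: str      # three characters, e.g. "r-x"


def parse_acl(acl: str) -> list:
    """Split a short-form ACL string into entries (illustrative parser)."""
    entries = []
    for raw in acl.split(","):
        parts = raw.strip().split(":")
        scope = "access"
        if parts[0] == "default":  # default ACLs are inherited by new children
            scope = "default"
            parts = parts[1:]
        if len(parts) != 3:
            raise ValueError(f"malformed ACL entry: {raw!r}")
        kind, qualifier, perms = parts
        entries.append(AclEntry(scope, kind, qualifier, perms))
    return entries


def effective_perms(entry: AclEntry, entries: list) -> str:
    """Apply the mask to named users, named groups and the owning group."""
    mask = next((e.perms for e in entries
                 if e.kind == "mask" and e.scope == entry.scope), None)
    masked = entry.kind == "group" or (entry.kind == "user" and bool(entry.qualifier))
    if mask is None or not masked:
        return entry.perms  # owning user and "other" are not limited by the mask
    return "".join(p if m != "-" else "-" for p, m in zip(entry.perms, mask))


if __name__ == "__main__":
    # "1234" stands in for an Azure AD object ID.
    acl = parse_acl("user::rwx,user:1234:rwx,group::r-x,mask::r--,other::---")
    for e in acl:
        print(e.kind, e.qualifier or "(owner)", e.perms, "->", effective_perms(e, acl))
```

In this sketch the named user is granted `rwx` but ends up with `r--`, because the mask caps what named users and groups can actually do.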