Data Analysis with Spark and Databricks in Azure Synapse

13/01/2021

Ginger Grant (Desert Isle Group)

Azure Synapse Workspace provides the ability to use both Apache Spark and Databricks. Which one should you use? The answer, of course, is “it depends”. In this session we will review the use cases that determine why you would select one tool over the other. We will examine cost, the kind of data being analyzed, how much data is processed, the variability of data loads, and other factors that determine which solution should be implemented. This session will also review when a Spark-based processing tool should be used and when you are better off with another tool, such as SQL on-demand or an Extract, Load and Transform (ELT) process. The demos will show how to implement each solution in the Azure Synapse Workspace and how each can be used to process data.

Learning Objectives:

Understand the use cases for selecting Databricks as the scalable processing solution.
Learn what you need to consider to determine when Apache Spark would be the best choice given your data environment.
Be able to determine when it makes sense to use Databricks or Apache Spark, and when you are better off using an Extract, Load and Transform (ELT) tool.

Speakers

Ginger Grant

Principal Consultant in Advanced Analytics

Ginger Grant manages the consultancy Desert Isle Group and shares with people around the world what she has learned while working with data technology.

As a Microsoft MVP in Data Platform, Microsoft Certified Trainer and an instructor on DataCamp, she focuses on guiding clients to create solutions using the entire Microsoft Data Stack, which includes SQL Server, Power BI, and Azure Data Cloud components.

When not working, she prototypes the latest pre-release data technologies, maintains her blog http://www.desertislesql.com, and spends time on Twitter @desertislesql.
