dataMinds News Round up – March 2023
Azure
Welcome to the Azure Synapse Analytics February update! This month, you’ll find sections on UTF-8 and Japanese Collation support, the General Availability of Spark 3.3, and other features in SQL, Spark, and Data Integration. Ryan Majidimehr walks you through Azure’s newest releases!
GPT-3 is a natural language AI model that can understand the text you feed into it and then generate new text based on your input. For the integration between Synapse and Azure OpenAI, SynapseML can be used. This open-source library simplifies the creation of scalable Machine Learning pipelines. One of the features it provides is the ability to send requests to Azure Cognitive Services using APIs. It’s this feature that makes it easy to access Azure OpenAI from Synapse Analytics.
Enough talking? Thomas Costers joins Stijn Wynants to show how we can make use of Azure OpenAI GPT-3 from within Spark in Synapse Analytics by building a sentiment analysis on restaurant reviews using GPT-3.
Security has many layers and frequently it will determine how you build your process. Liliam Leme starts by reviewing several important security considerations which you can later apply to your Synapse environment.
She provides an overview of the Synapse security environment focused on Dedicated SQL Pool, Serverless SQL Pool, and Spark.
In this article, Bhaskar Sharma will discuss how to physically model an Azure Synapse Analytics data warehouse while migrating from an existing on-premises MPP (Massive Parallel Processing) data warehouse solution like Teradata and Netezza. The approach and methodologies discussed are based on the knowledge and insights gained while actually migrating these data warehouses to Azure Synapse dedicated SQL pool!
SQL
Large databases usually have a negative impact on maintenance time, scalability and query performance. For maintenance, these large single databases have to be backed up daily while the amount of actual changing data might be small. For performance, tables without correct indexes result in full table or clustered index scans. As the data grows, the total query time increases linearly. How can we decrease downtime for the maintenance window for large databases and optimize the performance of daily queries? John Miner shines his light.
Power BI
Whether you love PowerPoint or whether you hate it, PowerPoint is one of the most commonly used communication tools in organizations around the world. If you want to reach an audience with information, odds are you’re going to use PowerPoint. The Power BI team just announced that the “storytelling” integration between Power BI and PowerPoint is now generally available (GA) for Power BI customers. Matthew Roche walks the talk.
Have you ever overwritten a Power BI report in the service before realizing that you haven’t edited the latest version of it? In this article, Romain Casteres presents tools and features to support DataOps for Power BI developments, such as Power BI Deployment Pipelines, Azure DevOps using libraries, Azure DevOps using customizable PowerShell scripts, and an advanced DevOps Pipeline that uses Tabular Editor to build the Data Model on the fly, run DAX validation queries, and run the Best Practice Analyzer before publishing validated, reviewed artifacts to Production Workspaces.
These are some of the mantras repeated in Power BI circles and content. They are examples of best practices or optimization techniques: things that may result in a better solution when followed. However, each of these exists across a spectrum of how often it applies, and what the measurable impact is on the final solution. The truth is that while they might generally apply to many cases, they don’t always apply to every case. It depends on the data, the model, or even the users and use case. So what does it mean for your solution to follow best practices, and should it? How can we successfully optimize the things we’ve made in Power BI?
In this article, Kurt Buhler explains the difference between a best practice and an optimization technique, and how to apply them to your Power BI solution. He discusses the dangers of treating an optimization as a best practice, and shares an approach to tackle complex problems, illustrating with examples how optimizations should be applied to a Power BI dataset or report.
Learning
Starting your journey to learn SQL? Adam gives you some resources to quickly ramp up with T-SQL for Azure SQL Database and Azure Synapse Analytics!