dataMinds News Roundup – February 2021
Databricks
Amog Kamsetty and Archit Kulkarni write about two new integrations with Ray and MLflow: Ray Tune+MLflow Tracking and Ray Serve+MLflow Models, which together make it much easier to build machine learning (ML) models and take them to production.
The financial services industry (FSI) is rushing towards transformational change, delivering transactional features and facilitating payments through new digital channels to remain competitive. Unfortunately, the speed and convenience that these capabilities afford also benefit fraudsters. Sri Ghattamaneni, Ricardo Portilla and Nikhil Gupta combine rule-based and AI models to combat financial fraud.
Azure Data Factory
The Azure Data Factory team announced a new update to the ADF data wrangling feature, which is currently in public preview. With Power Query embedded in ADF, you can now use the PQ editor to explore and profile data as well as turn your M queries into scaled-out data prep pipeline activities. Microsoft’s Mark Kromer guides you through it.
Both Azure Data Factory and Azure Databricks offer transformations at scale when it comes to ELT processing. On top of that, ADF lets you orchestrate the whole solution easily. If you prefer to use Scala, Python or SQL code in your process rather than Mapping Data Flows in ADF, you must link ADF to Databricks. Looking for a way to connect to Databricks from Azure Data Factory? Kamil Nowinski shows you two different ways to do so.
Azure Synapse Analytics
Azure Synapse Analytics provides several visualization capabilities: built-in data visualization for Spark SQL query results, language-specific data visualization libraries and, of course, Power BI reporting integration. Fikrat Azizov describes each of these tools.
What exactly is SQL Serverless or SQL On-demand? Liliam Leme shares her experience in serverless architecture and concepts.
Azure Synapse Analytics has inherited most of its data integration and orchestration capabilities from Azure Data Factory. Fikrat Azizov covers the similarities and differences between the two.
SQL
SQL Server triggers are another tool in your DBA or Dev toolbox. Edward Pollack explains what can go wrong with triggers and how to correct those issues.
SQL Server tech interview questions must be well crafted to make sure the candidate actually knows the topic. In his article, Sergey Gigoyan provides his favorite questions for interviewing SQL Server developers.
Power BI
Following on from his last two posts comparing the performance of importing data from ADLSgen2 into Power BI using the ADLSgen2 connector and going via Synapse Serverless, Chris Webb looks at a third option for connecting to CSV files stored in ADLSgen2: connecting via a Common Data Model folder.
Last month, the Power Query team announced the introduction of query folding indicators in Power Query Online. If you’re authoring a dataflow in Power BI or Power Apps, you will now see visual indicators that show which steps will fold and which will not. Matthew Roche walks you through them.
Wouldn’t it be nice to have modeling best practices codified in a single place and to be alerted to a modeling issue as you are developing your model? Think of how ‘Spell Checker’ works in Microsoft Word: it notifies you of spelling or grammar mistakes while you are typing. Power BI’s Michael Kovalsky elaborates on how Tabular Editor’s Best Practice Analyzer can be most useful.
Self-service BI! This is what Power BI is all about for business users. They can connect to any available data source and start creating their own reports. However, as Uncle Ben would say, with great Power (BI) comes great responsibility, and it is not certain that business users are aware of their responsibilities when it comes to Power BI datasets. When the number of datasets within a tenant grows beyond control, various issues can arise. Asanka Padmakumara shows you how to stop your Power BI tenant from becoming a dataset swamp.