Outperform Spark with Python Notebooks in Fabric
22/10/2025
When Microsoft Fabric was released, it came with Apache Spark out of the box. Spark's support for multiple programming languages opened up possibilities for creating data-driven, automated lakehouses. On the other hand, Spark's primary feature, scaling out to handle large amounts of data, is in many cases oversized, less performant, and more costly for trivial workloads.
With Python Notebooks, we have a better tool for handling metadata, automation, and processing of more trivial workloads, while still having the option to use Spark Notebooks for handling more demanding processing.
We will cover:
* The difference between Python Notebooks and a single-node Spark cluster, and why Spark Notebooks are more costly and less performant for certain types of workloads.
* When to use Python Notebooks and when to use Spark Notebooks.
* Where to use Python Notebooks in a meta-driven Lakehouse.
* A brief introduction to tooling, and how to move workloads between Python Notebooks and Spark Notebooks.
* How to avoid overloading the Lakehouse tech stack with Python technologies.
* Costs.
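As a taste of the kind of workload the session targets, here is a minimal sketch of a "trivial" aggregation done with the Python standard library alone, the sort of small job a single-node Python Notebook handles well without spinning up a Spark cluster. The data, file layout, and column names are hypothetical examples, not taken from the session itself.

```python
import csv
import io
from collections import defaultdict

# A tiny inline CSV, standing in for a small file in a lakehouse's
# Files area (hypothetical data for illustration only).
raw = """region,amount
EMEA,100
EMEA,250
AMER,300
"""

# Sum amounts per region -- a workload far too small to justify
# the startup and scheduling overhead of a Spark cluster.
totals: dict[str, int] = defaultdict(int)
for row in csv.DictReader(io.StringIO(raw)):
    totals[row["region"]] += int(row["amount"])

print(dict(totals))  # {'EMEA': 350, 'AMER': 300}
```

In a real Fabric Python Notebook you would typically reach for a single-node dataframe library instead of hand-rolled loops, but the point stands: for small data, plain Python starts faster and costs less than a Spark session.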
Currently works in twoday's Data & AI DK department for Technologies and Architecture, and is part of Mugato as a senior developer and AI developer. Started programming as a kid, and still does. Has built everything from embedded systems to data warehouses. Over the last decade, the focus has mainly been on data: from optimization and infrastructure to designing and building data solutions in the cloud and on-premises.
| Time | Agenda |
| --- | --- |
| 18:30 | Welcome and introductions |
| 18:30 | Outperform Spark with Python Notebooks in Fabric (60 minutes) |
| 19:45 | Session End |
Virtual Meeting (Teams Link)