Microsoft Fabric Updates Blog

Semantic link in Microsoft Fabric: Bridging BI and Data Science

We are pleased to introduce the Public Preview of semantic link, an innovative feature that seamlessly connects Power BI datasets with Synapse Data Science within Microsoft Fabric. As the gold layer in a medallion architecture, Power BI datasets contain the most refined and valuable data in your organization. With semantic link, we unlock this data’s potential beyond traditional business intelligence by making it accessible to notebooks and Python in Microsoft Fabric.

Python has emerged as the go-to language for state-of-the-art machine learning and boasts a vast ecosystem of libraries for a wide range of tasks, including rich visualizations, statistical analysis, and data validation. By bridging the gap between Power BI and Python, we aim to empower business analysts to use modern data tools with their data, enable Power BI developers to streamline automation tasks, and facilitate seamless collaboration with data scientists.

Semantic link supports the popular pandas and Spark APIs, making it easy to join existing data and apply common libraries. You can compute Power BI measures, read tables, and execute DAX queries. Semantic link goes beyond plain data connectivity by propagating semantic information from Power BI to power new capabilities of Microsoft Fabric for data augmentation, validation and exploration, as well as an extendable set of semantic functions.

Empowering Insights: Mapping the Journey from Power BI to OneLake with semantic link

In this blog post, we’ll showcase semantic link’s capabilities for accessing Power BI datasets.

Use semantic link to bring your Power BI data to pandas

Semantic link offers easy-to-use Python methods for pandas users to discover and read data.

The following code snippets show how to install the Python library in Microsoft Fabric and evaluate Power BI measures. The resulting FabricDataFrame is a semantically aware pandas dataframe, with all its functionality, that provides additional features like semantic propagation and semantic functions. Note that this sample assumes that the Power BI dataset “Customer Profitability Sample” is accessible in the Fabric workspace.

Package installation and sample code for evaluate_measure.
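As a minimal sketch of the steps described above, the cell below evaluates a measure with the `evaluate_measure` function of the `sempy.fabric` module. It assumes a Fabric notebook in which the `semantic-link` package has been installed; the measure and group-by column are the ones named elsewhere in this post, and `revenue_by_region` is a hypothetical helper name chosen for illustration.

```python
# In a Fabric notebook, install the library first in its own cell:
#   %pip install semantic-link

DATASET = "Customer Profitability Sample"  # dataset assumed to be in the workspace
MEASURE = "Total Revenue"                  # measure named later in this post


def revenue_by_region():
    """Evaluate a Power BI measure grouped by one column.

    Returns a FabricDataFrame when run inside Microsoft Fabric.
    """
    # Imported lazily so this cell also loads outside Fabric, where SemPy is absent.
    import sempy.fabric as fabric

    return fabric.evaluate_measure(
        DATASET,
        measure=MEASURE,
        groupby_columns=["Customer[Country/Region]"],
    )
```

Because a FabricDataFrame supports the pandas API, calling `revenue_by_region().head()` in a Fabric notebook should show the first rows of the result.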

To make your adventures into notebooks even easier, you can use the %%dax cell magic to execute DAX. The sample below queries a Dynamic Management View (DMV), and its output is available in the _ variable for further analysis using Python (see output caching). All underlying requests run at low priority, so your production workload is not impacted.

%%dax cell magic loading and usage.

Use semantic link to bring your Power BI data to Spark

Spark users can access Power BI data from all languages supported in Fabric (Python, R, and SparkSQL) using the semantic link Spark native connector. Configure the Power BI catalog to gain access to all your datasets. In this example we evaluate a measure using the special _Metrics table. All other tables are accessible as, for example, “pbi.`Customer Profitability Sample`.Customer” and are ready to be combined with other Spark data sources.

Configuration and usage of the Spark native connector for Power BI datasets.
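A rough sketch of this configuration in PySpark might look like the following. The catalog class name is an assumption recalled from the Fabric documentation and should be verified for your runtime; `query_power_bi_metrics` is a hypothetical helper, and `spark` is the session that Fabric notebooks provide.

```python
# Fully qualified class name of the Power BI catalog. This is an assumption
# recalled from the Fabric documentation; verify it for your runtime.
PBI_CATALOG_CLASS = "com.microsoft.azure.synapse.ml.powerbi.PowerBICatalog"

# Average a measure per country/region through the special _Metrics table.
METRICS_QUERY = """
SELECT
    `Customer[Country/Region]`,
    AVG(`Total Revenue`) AS avg_total_revenue
FROM pbi.`Customer Profitability Sample`.`_Metrics`
GROUP BY `Customer[Country/Region]`
"""


def query_power_bi_metrics(spark):
    """Register the catalog under the name `pbi` and run the measure query."""
    spark.conf.set("spark.sql.catalog.pbi", PBI_CATALOG_CLASS)
    return spark.sql(METRICS_QUERY)
```

In a Fabric notebook, `query_power_bi_metrics(spark).show()` would then print the aggregated measure, provided the dataset is accessible.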

Use semantic propagation for data augmentation

Semantic link’s Python API returns a FabricDataFrame when accessing Power BI data to enable data augmentation and semantic functions. Here’s a brief example of how you can augment an existing dataframe with Power BI data. Instead of computing the measure for a set of dimensions, joining the dataframe, and filtering it, you can use the add_measure function, which matches the columns to the Power BI dataset (here Customer[Country/Region] and Industry[Industry]), computes the measures Total Revenue and Total COGS at these levels, and adds them automatically.

Data augmentation using add_measure.
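To make the idea concrete, the sketch below imitates in plain Python the manual work that augmentation replaces: measure values known per (Country/Region, Industry) level are attached to each row of an existing table by matching on those columns. It does not call semantic link; `attach_measures` is a hypothetical helper, and the measure values are made-up placeholders.

```python
# Hypothetical measure values per (Country/Region, Industry) level.
# In Fabric these would come from the Power BI dataset; the numbers are placeholders.
MEASURES_BY_LEVEL = {
    ("US", "Retail"): {"Total Revenue": 120.0, "Total COGS": 80.0},
    ("US", "Services"): {"Total Revenue": 200.0, "Total COGS": 90.0},
    ("CA", "Retail"): {"Total Revenue": 50.0, "Total COGS": 30.0},
}


def attach_measures(rows, measures_by_level, measure_names):
    """Return copies of the rows with measure columns added by matching dimensions."""
    augmented = []
    for row in rows:
        level = (row["Country/Region"], row["Industry"])
        values = measures_by_level.get(level, {})
        new_row = dict(row)  # copy, so the input rows stay unchanged
        for name in measure_names:
            # Rows without a matching level get None, as in a left join.
            new_row[name] = values.get(name)
        augmented.append(new_row)
    return augmented


existing = [
    {"Country/Region": "US", "Industry": "Retail", "Accounts": 3},
    {"Country/Region": "CA", "Industry": "Services", "Accounts": 1},
]
result = attach_measures(existing, MEASURES_BY_LEVEL, ["Total Revenue", "Total COGS"])
```

The first row picks up both measures, while the second has no matching level and receives `None` for each. With semantic link, the matching and the measure computation happen against the dataset itself.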

Discover semantic functions with intelligent code auto-completion

Semantic functions enable intelligent auto-complete by matching function parameters with column metadata. For example, the to_geopandas function provides suggestions to bind the lat_col and long_col parameters to the latitude and longitude columns based on Power BI data categories.

Semantic function parameter auto-complete.

A semantic function is a regular Python function, exposed on FabricDataFrames and accompanied by metadata to enable intelligent auto-completion. While semantic link provides a few semantic functions available on GitHub, you can define your own semantic functions using Python decorators. The @semantic_function decorator applied to the _is_capatial function makes it available for intelligent code auto-completion.
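The matching idea can be illustrated with a toy registry in plain Python. This is not semantic link's API: `toy_semantic_function`, `suggest_bindings`, and `to_points` are hypothetical names, and the data categories are simple strings standing in for Power BI column metadata.

```python
# Toy registry: function name -> (function, {parameter: required data category}).
REGISTRY = {}


def toy_semantic_function(**param_categories):
    """Hypothetical decorator (not SemPy's): registers a function together with
    the data category that each column parameter expects."""
    def decorator(func):
        REGISTRY[func.__name__] = (func, param_categories)
        return func
    return decorator


def suggest_bindings(func_name, column_categories):
    """Suggest a column for each parameter whose required category matches.

    Returns None if some parameter has no matching column.
    """
    _, param_categories = REGISTRY[func_name]
    bindings = {}
    for param, category in param_categories.items():
        matches = [col for col, cat in column_categories.items() if cat == category]
        if not matches:
            return None
        bindings[param] = matches[0]
    return bindings


@toy_semantic_function(lat_col="Latitude", long_col="Longitude")
def to_points(rows, lat_col, long_col):
    """Turn each row into a (latitude, longitude) tuple."""
    return [(row[lat_col], row[long_col]) for row in rows]


# Column name -> data category, as a dataset's metadata might describe them.
columns = {"Store": None, "Lat": "Latitude", "Lon": "Longitude"}
bindings = suggest_bindings("to_points", columns)
points = to_points([{"Store": "A", "Lat": 47.6, "Lon": -122.3}], **bindings)
```

Here the registry proposes `Lat` for `lat_col` and `Lon` for `long_col` because their categories match, which is the kind of suggestion auto-completion can surface. A table without such columns yields no suggestion at all.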

Explore and validate data in Power BI from Python

Ensuring data quality is a crucial task and semantic link provides tools to support this. In this example we visualize existing relationships defined in your Power BI dataset.

Visualizing Power BI dataset relationships using list_relationships.

To understand the data in even more detail, the find_dependencies and plot_dependencies_metadata methods help you understand and visualize functional dependencies present in your data:

Functional dependencies detected using find_dependencies function.
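For intuition about what a functional dependency is, here is a small plain-Python sketch. Column A functionally determines column B when every value of A maps to exactly one value of B. The helpers `determines` and `find_exact_dependencies` are hypothetical, check only exact dependencies between single columns, and make no claim about how semantic link detects dependencies internally.

```python
from itertools import permutations


def determines(rows, determinant, dependent):
    """True if each value of `determinant` maps to exactly one value of `dependent`."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for a key; a later mismatch
        # means the same key maps to two different values.
        if seen.setdefault(key, value) != value:
            return False
    return True


def find_exact_dependencies(rows, columns):
    """List the (determinant, dependent) column pairs that hold exactly in the rows."""
    return [(a, b) for a, b in permutations(columns, 2) if determines(rows, a, b)]


rows = [
    {"PostalCode": "98101", "City": "Seattle", "State": "WA"},
    {"PostalCode": "98104", "City": "Seattle", "State": "WA"},
    {"PostalCode": "98052", "City": "Redmond", "State": "WA"},
    {"PostalCode": "97201", "City": "Portland", "State": "OR"},
]
deps = find_exact_dependencies(rows, ["PostalCode", "City", "State"])
```

In this sample, postal code determines both city and state, and city determines state, but not the reverse. A dependency you expect and do not find is often a hint of a data quality issue worth investigating.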

To learn even more about data validation and exploration, visit our docs.

Get coding!

In summary, semantic link is a powerful tool that enables business analysts and data scientists to use data effectively in a comprehensive data science environment. By using semantic link, you can:

  • Eliminate duplicated business logic by empowering data scientists to directly access your semantic model in Power BI datasets.
  • Do even more with the semantic information present in Power BI datasets using semantic functions, data augmentation, validation, and exploration.

We hope you find semantic link useful. To try it, follow our how-to guides. We’d love to hear your feedback and suggestions in the comments and on Fabric Ideas!
