Microsoft Fabric Updates Blog

Create Metadata Driven Data Pipelines in Microsoft Fabric

Metadata-driven pipelines in Azure Data Factory, Synapse Pipelines, and now Microsoft Fabric let you ingest and transform data with less code, less maintenance, and greater scalability than writing separate code or pipelines for every data source. The key is to identify the data loading and transformation pattern(s) for your data sources and destinations, and then build a framework to support each pattern.
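To make the idea concrete, here is a minimal sketch in plain Python of what "metadata driven" means: one generic loop reads a control table and dispatches each entry to the handler for its load pattern. It is an illustration only, not the solution from the linked posts. The `TableConfig` fields, the pattern names, the table names, and the handler functions are all hypothetical, and the handlers only return a description instead of calling any Fabric activity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TableConfig:
    """One row of a hypothetical metadata (control) table."""
    source_table: str
    destination_table: str
    load_pattern: str  # illustrative pattern names: "full" or "incremental"


def full_load(cfg: TableConfig) -> str:
    # Placeholder: a real pipeline would run a copy step here.
    return f"FULL {cfg.source_table} -> {cfg.destination_table}"


def incremental_load(cfg: TableConfig) -> str:
    # Placeholder: a real pipeline would copy only new or changed rows.
    return f"INCREMENTAL {cfg.source_table} -> {cfg.destination_table}"


# One handler per pattern; supporting a new pattern means adding one entry.
PATTERN_HANDLERS: Dict[str, Callable[[TableConfig], str]] = {
    "full": full_load,
    "incremental": incremental_load,
}


def run_pipeline(metadata: List[TableConfig]) -> List[str]:
    """Process every configured table with the handler for its pattern."""
    results = []
    for cfg in metadata:
        handler = PATTERN_HANDLERS[cfg.load_pattern]
        results.append(handler(cfg))
    return results


# Hypothetical metadata: onboarding a new source is a new row, not new code.
METADATA = [
    TableConfig("sales.Customers", "dim_customer", "full"),
    TableConfig("sales.Orders", "fact_orders", "incremental"),
]

if __name__ == "__main__":
    for line in run_pipeline(METADATA):
        print(line)
```

The point of the design is that the loop and the handlers stay fixed while the metadata grows, which is where the reduced code and maintenance come from.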

I recently published two blog posts about a metadata-driven pipeline solution I created in Fabric.

Features include:

  • Metadata driven pipelines
  • Star schema design for Gold layer tables
  • Source data loaded into Fabric Lakehouse with Copy Data
  • Incremental loads and watermarking for large transaction tables and fact tables
  • Two patterns for the Gold layer
    • Fabric Lakehouse loaded with Copy Data activities and Spark notebooks
    • Fabric Data Warehouse loaded with Copy Data activities and SQL Stored Procedures
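The incremental load and watermarking feature in the list above can be sketched in a few lines of plain Python. This is a generic illustration of the watermark technique under stated assumptions, not the implementation from the linked posts: rows are modeled as dictionaries, the change-tracking column name `modified_at` is hypothetical, and a real pipeline would persist the watermark in its metadata store between runs.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def incremental_extract(
    rows: List[Row],
    watermark: datetime,
    column: str = "modified_at",  # hypothetical change-tracking column
) -> Tuple[List[Row], datetime]:
    """Return rows changed after the watermark, plus the new watermark."""
    new_rows = [r for r in rows if r[column] > watermark]
    # If nothing changed, keep the old watermark so the next run starts there.
    new_watermark = max((r[column] for r in new_rows), default=watermark)
    return new_rows, new_watermark


# Hypothetical source data for illustration only.
SOURCE_ROWS = [
    {"order_id": 1, "modified_at": datetime(2024, 1, 1)},
    {"order_id": 2, "modified_at": datetime(2024, 1, 2)},
    {"order_id": 3, "modified_at": datetime(2024, 1, 3)},
]

if __name__ == "__main__":
    loaded, wm = incremental_extract(SOURCE_ROWS, datetime(2024, 1, 2))
    print([r["order_id"] for r in loaded], wm.isoformat())
```

Running the extract a second time with the returned watermark yields no rows, which is what keeps repeated loads of large transaction and fact tables cheap.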

Why two options for the Gold layer? If you want to use T-SQL Stored Procedures for transformations, or have existing Stored Procedures to migrate to Fabric, Fabric Data Warehouse may be your best option, since it supports multi-table transactions and INSERT/UPDATE/DELETE statements. Comfortable with Spark notebooks? Then consider Fabric Lakehouse, which has the added bonus of a Direct Lake connection from Power BI.

Check out the posts below to learn about building Metadata Driven Pipelines in Microsoft Fabric!

Part 1 – Metadata Driven Pipelines for Fabric with Lakehouse as Gold Layer

Part 2 – Metadata Driven Pipelines for Fabric with Data Warehouse as Gold Layer
