Construct a data analytics workflow with a Fabric Data Factory data pipeline

Microsoft Fabric Data Factory provides an easy, low-code way to build data integration and ETL projects for cloud-scale data analytics. Today, I want to focus on data pipelines in Data Factory and the advantages of using pipelines to orchestrate your Fabric data analytics projects and activities.

What is a data pipeline?

For Azure Data Factory and Azure Synapse users, data pipelines will be very familiar, as those products have included them for many years. Now that Data Factory and data pipelines are available in the SaaS environment of Fabric, you will find the experience to be nearly identical. However, if you are primarily a Power BI or Power Platform user, you may not have worked with data pipelines before. So, today, I’d like to take a few minutes to explain what a data pipeline is.

In the context of Fabric data analytics, you will use a data pipeline to build automated workflows that combine the artifacts you’ve created in your workspace into a cohesive analytics process. As an example, in the screenshot below, you can see that I’ve built a pipeline that performs the following tasks (a rough code sketch of this control flow follows the list):

  1. Find files in a storage folder
  2. Iterate over the files found
  3. Copy each file’s contents to the bronze layer in my Lakehouse
  4. After the data has been loaded to bronze, run a Spark Notebook to transform the data and load it into the silver layer
  5. If the Notebook succeeds, send an email to the team and continue
  6. If the Notebook fails, notify the team via a Teams channel and then fail the pipeline
  7. Execute a Dataflow to combine and clean the data, preparing it for the gold layer
  8. Finally, issue a Copy command to load the cleaned data into the gold layer for reporting
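
To make that control flow concrete, here is a rough Python sketch of the same logic. To be clear, this is illustrative pseudocode only: the actual pipeline is assembled from activities on the design surface, and every function name below is a hypothetical stand-in for the corresponding activity.

```python
# Illustrative pseudocode only: the real pipeline is built visually in the
# Fabric designer. These functions are hypothetical stand-ins for activities.

def get_files(folder: str) -> list[str]:
    return ["sales_2024.csv", "sales_2025.csv"]    # Get Metadata activity

def copy_data(source: str, layer: str) -> None:
    print(f"Copying {source} into {layer}")        # Copy activity

def run_notebook(name: str) -> bool:
    print(f"Running Notebook {name}")              # Notebook activity
    return True

def notify(channel: str, message: str) -> None:
    print(f"[{channel}] {message}")                # Outlook / Teams activities

def run_dataflow(name: str) -> None:
    print(f"Running Dataflow {name}")              # Dataflow activity

def run_workflow(folder: str) -> None:
    for f in get_files(folder):                    # steps 1-2: find files, iterate
        copy_data(f, layer="bronze")               # step 3: land raw files in bronze
    if run_notebook("bronze_to_silver"):           # step 4: Spark transform to silver
        notify("email", "Silver load succeeded")   # step 5: email the team on success
    else:
        notify("teams", "Silver load failed")      # step 6: post to Teams on failure...
        raise RuntimeError("Notebook failed")      # ...and then fail the pipeline
    run_dataflow("clean_and_combine")              # step 7: combine and clean for gold
    copy_data("silver tables", layer="gold")       # step 8: load gold for reporting

run_workflow("landing/")
```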

Why would you use a data pipeline?

I created that pipeline design entirely in the web UI in Fabric without writing any code. Now I can set a schedule from the designer UI by clicking the Schedule button, automating the execution of my logic on a regular cadence. The frequency with which you update your Lakehouse will depend on your business requirements and how often new data arrives at your sources.
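
Scheduling from the UI is the simplest route, but if you also want to trigger a run from your own code, the Fabric REST API exposes a job scheduler endpoint for running items on demand. Here’s a minimal sketch, assuming you’ve already acquired an Azure AD access token; the workspace and pipeline GUIDs below are placeholders you supply.

```python
import requests

# Placeholders: supply your own workspace/pipeline GUIDs and an Azure AD
# access token (e.g. acquired with the azure-identity or MSAL libraries).
WORKSPACE_ID = "<workspace-guid>"
PIPELINE_ID = "<pipeline-item-guid>"
TOKEN = "<aad-access-token>"

# Fabric job scheduler endpoint for running an item on demand; for data
# pipelines the job type is "Pipeline".
url = (
    "https://api.fabric.microsoft.com/v1/"
    f"workspaces/{WORKSPACE_ID}/items/{PIPELINE_ID}/jobs/instances"
    "?jobType=Pipeline"
)
resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()

# A 202 response means the run was accepted; the Location header points at
# the job instance you can poll for status.
print(resp.status_code, resp.headers.get("Location"))
```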

Separately, inside of Fabric, I can create and manage the artifacts that I just orchestrated above. My Notebook is created and tested in the Data Engineering app, while I used the Data Factory app to create a Dataflow. Data pipelines in Fabric bring them all together into a single, cohesive, logical “pipeline”. In other words, I’ve created an end-to-end workflow that runs on a schedule, fully automated. Additionally, I can now use the central Monitoring Hub feature in Fabric to watch the execution of my pipelines, Notebooks, Dataflows, etc. all from a single pane of glass.
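
The Monitoring Hub is the natural first stop, but if you’d rather check on a run from code, the same REST API lets you poll the job instance. Another minimal sketch, again with placeholder GUIDs and token; the job instance GUID is the one returned in the Location header when the run was triggered.

```python
import time
import requests

# Placeholders as before; the job instance GUID comes back in the Location
# header when the pipeline run is triggered.
WORKSPACE_ID = "<workspace-guid>"
PIPELINE_ID = "<pipeline-item-guid>"
JOB_INSTANCE_ID = "<job-instance-guid>"
TOKEN = "<aad-access-token>"

url = (
    "https://api.fabric.microsoft.com/v1/"
    f"workspaces/{WORKSPACE_ID}/items/{PIPELINE_ID}"
    f"/jobs/instances/{JOB_INSTANCE_ID}"
)
while True:
    job = requests.get(url, headers={"Authorization": f"Bearer {TOKEN}"}).json()
    status = job.get("status")                     # e.g. InProgress, Completed, Failed
    print("Pipeline run status:", status)
    if status not in ("NotStarted", "InProgress"):
        break
    time.sleep(30)                                 # poll every 30 seconds
```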

So as you build your analytics project in Fabric, you’ll use data pipelines to piece those artifacts together into an automated workflow that keeps your Lakehouse refreshed and clean, and your business reporting users up to date.

How to get started

I hope that this gives you a sense of the value that data pipelines from the Data Factory app inside of Microsoft Fabric can bring to your data analytics projects. To get started, switch over to Data Factory in Fabric and choose New > Data Pipeline. You’ll land on the page shown in the screenshot below, where you can begin adding activities to the low-code design surface and build your own workflows!

Other resources

  • Join the Fabric community to post your questions, share your feedback, and learn from others.
  • Visit Microsoft Fabric Ideas to submit feedback and suggestions for improvements and vote on your peers’ ideas!
  • Check our Known Issues page to stay up to date on product fixes!

Have any questions or feedback? Leave a comment below!
