Microsoft Fabric Updates Blog

Wondering how to incrementally amass data in your data destination? This is how!

In many scenarios, you want to get only new data from your sources and append it to your data destination to report over. Dataflows Gen2 supports data destinations, so you can set up your own pattern to load new data, replace some old data, and keep your reports up to date with your source data.

Screenshot of the Power Query Diagram View showcasing three queries

The pattern is simple: you create a dataflow that loads data from your source and appends it to your data destination, then create a pipeline that runs this dataflow on a schedule. This keeps your data destination up to date with your source data. The key is to retrieve only the new data from your source. You can do this by getting the latest timestamp from your data destination and using it to filter the data from your source, so each run appends only the rows that are newer than what you have already loaded.
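To make the filtering step concrete, here is a minimal Python sketch of the same idea outside of Dataflows Gen2. It is only an illustration: the `modified_at` column, the row layout, and the function names are assumptions made for this example, and in-memory lists stand in for the source and the data destination. In a real dataflow you would express the equivalent filter in Power Query.

```python
from datetime import datetime


def latest_timestamp(destination_rows, column="modified_at"):
    """Return the newest timestamp already loaded, or None if the destination is empty."""
    return max((row[column] for row in destination_rows), default=None)


def incremental_load(source_rows, destination_rows, column="modified_at"):
    """Append only the source rows that are newer than the destination's latest timestamp."""
    watermark = latest_timestamp(destination_rows, column)
    new_rows = [
        row for row in source_rows
        if watermark is None or row[column] > watermark
    ]
    destination_rows.extend(new_rows)
    return new_rows


# Hypothetical sample data: the destination already holds the first row.
source = [
    {"id": 1, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "modified_at": datetime(2024, 1, 2)},
    {"id": 3, "modified_at": datetime(2024, 1, 3)},
]
destination = [{"id": 1, "modified_at": datetime(2024, 1, 1)}]

appended = incremental_load(source, destination)
print([row["id"] for row in appended])  # [2, 3]
```

Running the load a second time appends nothing, because the latest timestamp in the destination now matches the newest source row. Note that a strict "newer than" comparison assumes timestamps only ever increase; rows that arrive late with an older timestamp would be missed.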

Screenshot of the diagram view inside of Data Pipelines where a notebook, a dataflow and Teams activities are being used

To replace data you loaded previously, you can use a Fabric notebook to run a query on your data destination that deletes the data you want to replace. Within the pipeline, run the Fabric notebook first to delete the data, then run the dataflow to append the new data. This way you can replace data in your data destination and keep your reports up to date with your source data.
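The delete-then-append sequence can be sketched as follows. This is a simplified stand-in, not the code from the documentation page: an in-memory SQLite database plays the role of the data destination, and the `sales` table, its columns, and the `replace_window` function are hypothetical names chosen for this example. In the actual pattern the delete runs in a Fabric notebook and the append is done by the dataflow; here both steps sit in one function only to show the ordering.

```python
import sqlite3


def replace_window(conn, new_rows, start_date):
    """Delete destination rows on or after start_date, then append new_rows."""
    with conn:  # one transaction: commits on success, rolls back on error
        # Step 1 (the notebook's job in the pipeline): remove the data to replace.
        conn.execute("DELETE FROM sales WHERE order_date >= ?", (start_date,))
        # Step 2 (the dataflow's job in the pipeline): append the fresh data.
        conn.executemany(
            "INSERT INTO sales (order_id, order_date, amount) VALUES (?, ?, ?)",
            new_rows,
        )


# Hypothetical destination table with three previously loaded rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "2024-01-01", 10.0), (2, "2024-01-02", 20.0), (3, "2024-01-03", 30.0)],
)
conn.commit()

# Fresh source data for 2024-01-02 onward: order 2 was corrected, order 4 is new.
refreshed = [(2, "2024-01-02", 25.0), (3, "2024-01-03", 30.0), (4, "2024-01-04", 40.0)]
replace_window(conn, refreshed, "2024-01-02")

rows = conn.execute("SELECT order_id, amount FROM sales ORDER BY order_id").fetchall()
print(rows)  # [(1, 10.0), (2, 25.0), (3, 30.0), (4, 40.0)]
```

Deleting before appending is what prevents duplicates: if the append ran first, the corrected rows would sit next to their stale versions.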

We have created a documentation page that explains this pattern in more detail and provides you with the code to get started. You can find the documentation page here: https://learn.microsoft.com/fabric/data-factory/tutorial-setup-incremental-refresh-with-dataflows-gen2

We hope this helps you get started with incrementally amassing data with Dataflows Gen2. We are also developing native incremental refresh for Dataflows Gen2, which has been one of the top-voted ideas on our ideas website. Vote for it here: https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=4814b098-efff-ed11-a81c-6045bdb98602
