Recap of Data Factory Announcements at Fabric Community Conference Europe
Last week was an exciting one for Fabric: the Fabric Community Conference Europe was filled with product announcements and sneak previews of upcoming features.
Thanks to all of you who participated in the conference, either in person or by being part of the many virtual conversations through blogs, Community forums, social media and other channels. Thank you also for all your product feedback and Ideas forum suggestions that help us define the next wave of product enhancements.
To make sure you didn’t miss any of the Data Factory in Fabric announcements, here is a recap of all the new features.
General Availability announcements
- Copilot for Dataflow Gen2 [Announcement]
- Fast Copy for Dataflow Gen2 [Announcement]
- On-premises data gateway support for Data Pipeline [Announcement]
- Mirrored Database support for Snowflake [Announcement]
Public Preview announcements
- Incremental Refresh for Dataflow Gen2 [Announcement]
- New Azure Data Factory item in Fabric [Announcement]
- Copy Job item in Fabric [Announcement]
- Fabric User Data Functions support in Data Pipeline [Announcement]
- Invoke remote pipeline from ADF & Synapse [Announcement]
- Spark job environment parameters support for Data Pipeline [Announcement]
You can continue reading below for more information about each of these features.
Copilot for Dataflow Gen2
Copilot for Data Factory with Dataflow Gen2 is now generally available. As part of the Copilot in Fabric experience, it lets you use natural language to transform data and generates code explanations that help you better understand previously created queries and tasks.
Copilot in Fabric helps you enhance productivity, unlock profound insights, and facilitate the creation of custom AI experiences tailored to your data. As a component of the Copilot in Fabric experience, Copilot for Data Factory empowers you to use natural language to articulate your requirements for creating data integration solutions using Dataflow Gen2. Essentially, Copilot for Data Factory operates like a subject-matter expert (SME) collaborating with you to design your dataflows.
Fast Copy for Dataflow Gen2
Fast Copy in Dataflow Gen2 is now generally available! This powerful feature enables rapid and efficient ingestion of large data volumes, leveraging the same robust backend as the Copy Activity in Data pipelines.
With Fast Copy, you can expect significantly shorter data processing times and improved cost efficiency for your Dataflow Gen2 workloads. It also boosts end-to-end performance: use Fast Copy to ingest data into staging, then seamlessly transform it at scale using SQL DW compute.
Fast Copy supports numerous source connectors, including ADLS Gen2, Azure Blob Storage, Azure SQL DB, On-Premises SQL Server, Oracle database, Fabric Lakehouse, Fabric Warehouse, PostgreSQL, and Snowflake.
You can learn more about Dataflows Fast Copy here: Announcing the General Availability of Fast Copy in Dataflows Gen2 | Microsoft Fabric Blog | Microsoft Fabric
Data Pipelines accessing on-premises data using the On-premises data gateway
We are thrilled to announce the General Availability of on-premises connectivity for Data pipelines in Microsoft Fabric.
Using the on-premises data gateway, customers can connect to on-premises data sources from data pipelines in Data Factory in Microsoft Fabric. This enhancement significantly broadens the scope of data integration capabilities: organizations can keep databases and other data sources on their on-premises networks while securely integrating and orchestrating them using data pipelines in Microsoft Fabric.
Mirrored Database support for Snowflake
Mirroring for Snowflake is now generally available, offering a frictionless way to bring entire Snowflake databases into your OneLake data estate. Setting up Mirroring is simple, and once the replication process starts, the mirrored data is automatically kept up to date in OneLake in near real time. With your Snowflake data landed in OneLake, it is available everywhere in Fabric and ready to accelerate your data potential.
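As one illustration of “available everywhere,” OneLake exposes an ADLS Gen2-compatible endpoint, so mirrored tables can be reached with standard Azure SDKs as well as from Fabric experiences. The sketch below is a minimal example under that assumption; the workspace name, item name, and folder layout are placeholders, not verified paths.

```python
# A minimal sketch, assuming OneLake's documented ADLS Gen2-compatible
# endpoint; workspace and item names are placeholders, and the exact
# folder layout of a mirrored database item is an assumption here.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential(),
)

# In OneLake, the workspace behaves like a file system and each item
# (such as a mirrored database) is a top-level folder inside it.
workspace = service.get_file_system_client("<workspace-name>")
for entry in workspace.get_paths(path="<mirrored-database-item>/Tables"):
    print(entry.name)
```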
Learn more: Mirroring – Microsoft Fabric | Microsoft Learn
Incremental Refresh support for Dataflow Gen2
Incremental Refresh in Dataflow Gen2 is now in public preview! This powerful feature is designed to optimize your data processing by ensuring that only the source data that has changed since the last dataflow refresh is processed. This means faster dataflow refreshes and more efficient resource usage.
Incremental refresh can be configured for each query via the contextual menu (right-click) in the Queries pane.
Key benefits of leveraging Incremental Refresh for your dataflows include:
- Efficiency: Only the source data that has changed since the last refresh is processed, saving time and resources.
- Performance: Faster dataflow refreshes, since less data is processed and buckets can be refreshed in parallel.
- Scalability: Handle large datasets more effectively by processing data in smaller, more manageable chunks.
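To make the mechanics concrete, here is a minimal conceptual sketch of the watermark pattern that underlies incremental refresh; the table, rows, and function names are illustrative assumptions, not the Dataflow Gen2 implementation.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for a source table with a
# change-tracking column ("modified").
SOURCE_ROWS = [
    {"id": 1, "modified": datetime(2024, 9, 1), "amount": 10},
    {"id": 2, "modified": datetime(2024, 9, 20), "amount": 25},
    {"id": 3, "modified": datetime(2024, 9, 28), "amount": 40},
]

def incremental_refresh(last_watermark: datetime) -> tuple[list[dict], datetime]:
    """Process only rows changed since the previous refresh and
    return them along with the new watermark to persist."""
    changed = [row for row in SOURCE_ROWS if row["modified"] > last_watermark]
    new_watermark = max((row["modified"] for row in changed), default=last_watermark)
    return changed, new_watermark

# The first refresh processes everything; subsequent refreshes only
# pick up rows that changed after the stored watermark.
rows, watermark = incremental_refresh(datetime.min)
print(f"Initial refresh processed {len(rows)} rows")

rows, watermark = incremental_refresh(watermark)
print(f"Next refresh processed {len(rows)} rows")
```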
Learn more: Incremental refresh in Dataflow Gen2 – Microsoft Fabric | Microsoft Learn
New Azure Data Factory item in Fabric
Bring your existing Azure Data Factory (ADF) to your Fabric workspace!
We are introducing a new preview feature that allows you to connect to your existing ADF factories from your Fabric workspace. You can now fully manage your ADF factories directly from the Fabric workspace UI! Once your ADF factory is linked to your Fabric workspace, you can trigger, execute, and monitor your pipelines just as you do in ADF, but directly inside Fabric.
Learn more: Bring Azure Data Factory to Fabric – Microsoft Fabric | Microsoft Learn
Copy Job item in Fabric
We are happy to announce the preview of Copy Job in Data Factory, elevating data ingestion to a more streamlined and user-friendly experience from any source to any destination. Copying your data is now easier than ever. Copy Job supports multiple data delivery styles, including both batch and incremental copy, offering the flexibility to meet your specific needs.
With Copy Job, you can enjoy the following benefits:
- Simplicity: A seamless data copying experience with no compromises, making it easier than ever.
- Efficiency: Enable incremental copying effortlessly, reducing manual intervention.
- Flexibility: Take full control of your data copying process.
- Highly performant and scalable: Move petabyte-scale data.
Learn more: What is Copy job (preview) in Data Factory – Microsoft Fabric | Microsoft Learn
Fabric User Data Functions support in Data Pipeline
Data pipelines in Fabric provide a simple interface for creating and managing large data processing tasks using Activities. Activities are the fundamental objects that represent each step of a data processing task, and users can combine several interconnected Activities to build large, elaborate data processing solutions.
Now available through a private preview, User Data Functions can be used as an activity, allowing users to create custom code processing steps for their Data pipelines. Within the private preview, you can find them by going to Activities and selecting the Functions activity. After the Functions activity is added to the Data pipeline, you will see the option to use Fabric User Data Functions in the Settings tab.
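To give a feel for what such a custom code step contains, below is a minimal sketch of a user data function written against the fabric.functions Python programming model; the function name, parameter, and logic are hypothetical, and exact module details may evolve while the feature is in preview.

```python
# A minimal sketch of a Fabric User Data Function; the function body
# and names are hypothetical examples, not shipped samples.
import fabric.functions as fn

udf = fn.UserDataFunctions()

@udf.function()
def normalize_country_code(raw_value: str) -> str:
    """Custom transformation step that a Data pipeline could invoke
    through the Functions activity."""
    aliases = {"usa": "US", "united states": "US", "uk": "GB"}
    key = raw_value.strip().lower()
    return aliases.get(key, key.upper())
```

In the pipeline, the Functions activity would then reference such a function by name and supply its parameter values, for example from pipeline expressions.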
Invoke remote pipeline from ADF & Synapse
We’ve been working diligently to make the very popular Data pipeline activity known as “Invoke Pipeline” better and more powerful. Based on customer feedback, we continue to iterate on the possibilities and have now added the exciting ability to call pipelines in Azure Data Factory (ADF) or Synapse Analytics, available as a public preview!
This opens up countless possibilities to reuse your existing ADF or Synapse pipelines inside a Fabric pipeline by calling them inline through the new Invoke Pipeline activity. Use cases such as calling Mapping Data Flows or SSIS packages from your Fabric Data pipeline are now possible as well.
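For context, creating a run in an ADF factory has long been possible through the public ADF REST API, and the Python sketch below exercises that call; every resource name is a placeholder. The new Invoke Pipeline activity gives you the same capability inline, without writing or hosting code like this yourself.

```python
# A minimal sketch of triggering an ADF pipeline run via the public
# ADF REST API (createRun). All resource names below are placeholders.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY = "<factory-name>"
PIPELINE = "<pipeline-name>"

# Acquire an Azure Resource Manager token for the REST call.
token = DefaultAzureCredential().get_token("https://management.azure.com/.default")

url = (
    f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
    f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.DataFactory"
    f"/factories/{FACTORY}/pipelines/{PIPELINE}/createRun"
    "?api-version=2018-06-01"
)

response = requests.post(
    url,
    headers={"Authorization": f"Bearer {token.token}"},
    json={},  # optional pipeline parameters go here
)
response.raise_for_status()
print("Started run:", response.json()["runId"])
```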
Spark Job environment parameters support for Data Pipeline
One of the most popular use cases in Fabric Data Factory today is automating and orchestrating Fabric Spark Notebook executions from your Data pipelines. A common request has been to reuse existing Spark sessions to avoid session cold-start delays. We’ve delivered on that requirement by enabling “Session tags” as an optional parameter under “Advanced settings” in the Fabric Spark Notebook activity! Now you can tag your Spark session and reuse it in later runs by supplying the same tag, greatly reducing the overall processing time of your Data pipelines.
Thank you for your feedback, and keep it coming!
We wanted to thank you for your support, usage, excitement, and feedback around Data Factory in Fabric. We’re very excited to continue learning from you regarding your Data Integration needs and how Data Factory in Fabric can be enhanced to empower you to achieve more with data.
Please continue to share your feedback and feature ideas with us via our official Community channels, and stay tuned to our public roadmap page for updates on what will come next.