Microsoft Fabric Updates Blog

Using Data pipelines for copying data to/from KQL Databases and crafting workflows with the Lookup activity

AUTHOR: Guy Reginiano, Program Manager

In today’s data-driven landscape, the ability to capture, analyze, and visualize vast amounts of real-time data from diverse sources is crucial for making informed decisions and gaining a competitive edge. Synapse Real-Time Analytics in Microsoft Fabric offers a comprehensive solution to this challenge. Its seamless integration with Data Factory in Microsoft Fabric allows you to use multiple sources and destinations to build data movement and data transformation solutions. In this blog post, we’ll delve into the integration of Data Factory’s Data pipeline and KQL Database within the Fabric ecosystem.

Data Factory: Your Gateway to Connectivity

Data Factory in Fabric empowers you with seamless connectivity to more than 170 data stores, including on-premises data sources, cloud databases, analytical platforms, line-of-business applications, and more.

A pivotal feature within the Data Factory experience is the Data pipeline, an evolution of Azure Data Factory pipelines. This enhanced version equips you with a rich set of integration capabilities, allowing you to seamlessly move and transform data across different sources and destinations. One of the standout features of Data pipelines is the Copy Assistant, a tool that accelerates your data copying process by providing a guided and streamlined experience.

Copy Assistant: Simplifying Data Copying

The Copy Assistant within Data pipelines is designed to jumpstart your data copying journey through a user-friendly, guided process. Whether you’re a seasoned data professional or just starting out, the Copy Assistant simplifies the process of connecting to various data sources and destinations.

When you complete the steps of the Copy Assistant, a Copy activity is created in your Data pipeline and shown on the canvas.

KQL DB as a data source in Data pipeline’s Copy Assistant


Numerous Data Sources and Destinations

The Copy Assistant doesn’t just offer simplicity; it also presents an extensive range of data sources and destinations to choose from. These include sample data sources such as the NYC taxi data; Fabric data sources such as KQL DB, Data Warehouse, and Lakehouse; and Azure data sources such as Azure Blob Storage, Azure Data Explorer, and Cosmos DB. Additionally, the assistant allows integration with external data products such as Amazon Redshift, PostgreSQL, and Spark, along with files from Amazon S3 and Google Cloud Storage. Services like Snowflake and many others also find a place within this comprehensive spectrum.

KQL DB as a destination in Data pipeline’s Copy Assistant


KQL DB and Azure Data Explorer Integration

Real-Time Analytics’ KQL DB is supported as both a source and a destination, allowing you to build and manage various extract, transform, and load (ETL) activities that combine the power and capabilities of KQL DBs with the rich connector ecosystem of Data pipelines. The Azure counterpart of KQL DB, the Azure Data Explorer cluster, is also supported, enabling you to integrate Fabric capabilities with the Azure-based offering.

KQL DB as a destination in Data pipeline’s ‘Copy data’ activity
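
To make the copy step more concrete, here is a minimal programmatic sketch of the same operation using the azure-kusto-ingest Python SDK, which works against both Azure Data Explorer clusters and the ingestion endpoint of a Fabric KQL DB. The ingestion URI, database, table, and file names are hypothetical placeholders; the Copy activity you configure on the pipeline canvas performs this work for you without any code.

```python
# Minimal sketch: ingest a local CSV file into a KQL DB table with the Kusto Python SDKs.
# The ingestion URI, database, table, and file names are hypothetical placeholders.
from azure.kusto.data import KustoConnectionStringBuilder
from azure.kusto.data.data_format import DataFormat
from azure.kusto.ingest import IngestionProperties, QueuedIngestClient

# Fabric KQL DBs and Azure Data Explorer clusters both expose an ingestion endpoint.
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://ingest-<your-cluster>.kusto.windows.net"
)
ingest_client = QueuedIngestClient(kcsb)

# Describe the target database and table, and the format of the incoming data.
props = IngestionProperties(
    database="SampleDB",        # hypothetical database name
    table="NycTaxi",            # hypothetical table name
    data_format=DataFormat.CSV,
)

# Queue the file for ingestion; the service loads it into the target table.
ingest_client.ingest_from_file("nyc_taxi_sample.csv", ingestion_properties=props)
```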

Lookup Activity

In addition to the Copy activity, there’s another valuable activity at your disposal when working with KQL DBs: the Lookup activity. This activity is specifically designed for executing queries and fetching datasets from KQL DBs, Azure Data Explorer, or any other data source supported by Data pipeline’s Lookup activity. The result of the query is returned as the output of the Lookup activity and can be used by the next activity in the pipeline. You can use it to dynamically determine which objects to operate on in a subsequent activity. The output can be consumed by subsequent copy, transformation, or control flow activities, such as the ForEach activity.

Lookup activity with KQL DB
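
To illustrate the Lookup-then-ForEach pattern described above, here is a minimal sketch in Python with the azure-kusto-data SDK: a query returns a small dataset, and each row of that result drives a follow-up operation. The query URI, database name, and queries are hypothetical placeholders; inside a Data pipeline, the Lookup and ForEach activities express the same pattern without code.

```python
# Minimal sketch of the Lookup + ForEach pattern with the Kusto Python SDK.
# The query URI, database name, and queries are hypothetical placeholders.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://<your-cluster>.kusto.windows.net"
)
client = KustoClient(kcsb)

# "Lookup": fetch a small dataset that drives the rest of the workflow,
# here the list of tables in the database.
lookup = client.execute("SampleDB", ".show tables | project TableName")

# "ForEach": run a follow-up query for every row the lookup returned,
# for example counting the rows in each table.
for row in lookup.primary_results[0]:
    table_name = row["TableName"]
    for count_row in client.execute("SampleDB", f"['{table_name}'] | count").primary_results[0]:
        print(f"{table_name}: {count_row['Count']} rows")
```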


Summary

The integration of KQL DBs and Data pipelines offers a streamlined approach to collecting, processing, storing, and analyzing data from various sources. This integration breaks down data silos and ensures consistency across sources. KQL DB’s strong querying capabilities enable users to extract insights effectively, promoting in-depth data exploration. This integration enhances operational flexibility and empowers Fabric users to reveal concealed patterns, trends, and relationships within their data. By combining KQL DBs with a Data pipeline, Fabric users can tap into their data’s full potential, fostering innovation and gaining a competitive advantage in an increasingly data-centric landscape.

For more information, visit the documentation.
