Microsoft Fabric Updates Blog

Sending data to Synapse Real-Time Analytics in Fabric from Apache Kafka Ecosystems using Java

Introduction

This quick start is based on the official Azure Event Hubs for Kafka example, adapted to work with Microsoft Fabric.

In this blog we will show you how to send data from Kafka to Synapse Real-time Analytics in Fabric. Get the sample source code from GitHub fabric-kafka-sample

You will use a Fabric Eventstream to receive data from Kafka, and then send it to a KQL-database for further processing. Eventstream supports Custom Apps that are backed by an Event Hub. This makes Fabric Eventstream compatible with Kafka, so you can use any Kafka client to send data to Fabric.

If you want to learn more about how Fabric supports Kafka, check out What is Azure Event Hubs for Apache Kafka.

Prerequisites

Before you begin, you will need access to an active Azure subscription. More details on how to activate a free trial account, visit: Microsoft Fabric (Preview) trial

In addition:

  • Java Development Kit (JDK) 1.7+
    • On Ubuntu, run apt-get install default-jdk to install the JDK.
    • Be sure to set the JAVA_HOME environment variable to point to the folder where the JDK is installed.
  • Download and install a Maven binary archive
    • On Ubuntu, you can run apt-get install maven to install Maven.
  • Git
    • On Ubuntu, you can run sudo apt-get install git to install Git.

Create an Eventstream

To create an Eventstream, you need to have a Fabric workspace.

Once you have your workspace, you can create an Eventstream by selecting the “Create” button, and then scrolling down until you see the Eventstream option. Full documentation on how to create an Eventstream can be found here.

Name your Eventstream “kafka-to-kql-eventstream” and select “Create.”

Create a Custom App

After you have created your Eventstream, select “New source,” and then select “Custom App.”

Creating a source called "from-kafka"

Name your custom app “from-kafka”.

New custom app "from-kaka"

Once the custom app is created, select it. You will see the EventHub connection string for your Eventstream in the Information panel at the bottom. Copy this Connection string-primary key, as you will need it later.

EventHub connection string

FQDN

For these samples, you will need the connection string as well as the FQDN that points to your Fabric Custom App. The FQDN can be found within your connection string as follows:

Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=XXXXXX;SharedAccessKey=XXXXXX;EntityPath=XXXXXX

Clone the example project

Now that you have your Custom App connection string, clone the Fabric Kafka Sample repository to your local machine:

git clone https://github.com/videlalvaro/fabric-kafka-sample 
cd fabric-kafka-sample/producer

The Data Producer

Using the provided producer example, send messages to the Fabric Eventstream service. To change the Kafka version, change the dependency in the pom file to the desired version.

We are going to send telemetry data to the Eventstream. The data is in JSON format, and looks like this:

{ 
"timestamp": "<timestamp>", 
"temperature": 25.0, 
"humidity": 13.0 
}

For our example, this data will be randomly generated by the producer.

Provide an Event Hubs Kafka endpoint

producer.config

Update the bootstrap.servers and sasl.jaas.config values in producer/src/main/resources/producer.config to direct the producer to the Custom Apps endpoint with the correct authentication.

bootstrap.servers=mynamespace.servicebus.windows.net:9093 
security.protocol=SASL_SSL 
sasl.mechanism=PLAIN 
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=XXXXXX;SharedAccessKey=XXXXXX;EntityPath=XXXXXX";

Run the producer from the CLI

This sample is configured to send messages to a Kafka topic that corresponds with your Custom App. In this case obtain the EntityPath value from the connection string since we provide that as a command line argument to the Producer class.

Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=XXXXXX;SharedAccessKey=XXXXXX;EntityPath=XXXXXX

To run the producer from the command line, generate the JAR and then run from within Maven (alternatively, generate the JAR using Maven, then run in Java by adding the necessary Kafka JAR(s) to the classpath):

mvn clean package 
mvn exec:java -Dexec.mainClass="TestProducer" \
-Dexec.args="<EntityIdValue>"

The producer will now begin sending events to the Fabric Custom App via the Kafka-enabled Event Hub.

From Eventstream to KQL-Database

Before starting to work with Eventstream, you need to create a KQL Database. This is the database that will hold the data that you send to the Eventstream.

Create the KQL Database

In your Fabric workspace, select New, and then choose “KQL Database” from the dropdown menu. Name your database “es-to-kql-database” and select “Create.”

Configure the Eventstream

Back in the Eventstream page, select the Eventstream in the center of the screen.

Select the Eventstream

Once you select it, you will see the preview table below being populated with your data:

Now, select “New destination”, and then select “KQL Database” from the dropdown menu.

New KQL Database destination

Name your destination “es-to-kql”. Enter your Workspace name, and finally name your KQL Database as “es-to-kql-database”. Then select “Create and configure.”

New destination KQL Database

Ingest the data to a KQL Table

Now it is time to configure the KQL Table that will hold our real-time data. Name the table “from-kafka-kql-table” and select “Next: Source.”

On the next screen, leave the default values and select “Next: Schema.”

Then on schema wait until the data is loaded in the preview. Then select the “Data format” dropdown and choose JSON, since that is the format of our data. You will see that the preview table automatically parses the data offering the different fields as columns with the right data types.

Select “Next: Summary.”

Now you will see the “Continuous ingestion from Eventstream established” screen. Select “Close” to finis the process of setting up your Custom App, ingesting data from Kafka, and sending it to a KQL Database.

Query the KQL Table

Close the previous confirmation screen. In the information pane at the bottom, you will see the “Status” marked as “Successful.” Below you will have a link to your KQL Database. Select “Open item.”

This will open a new tab with your KQL Database. Select your table in the Data tree pane, and then select “Explore your data.”

Type the following query in the query editor and select “Run”:

['from-kafka-kql-table'] 
| take 100

If all went well, you should see your query results like this:

From here on you can use the KQL Database as your building block for further processing, like generating Power BI reports. For more information on how to use KQL, check out the Kusto Query Language (KQL) overview documentation.

Conclusion

In this quick start you have learned how to send data from Kafka to Synapse Real-time Analytics in Fabric.

You learned how to create an Eventstream, that uses a Custom App as source for receiving data from Kafka.

You have also run your first KQL Query, and now are ready to explore the world of real-time analytics. Time to create your first Power BI report!

Keep learning

Related blog posts

Sending data to Synapse Real-Time Analytics in Fabric from Apache Kafka Ecosystems using Java

April 5, 2024 by Brad Watts

As part of the consumption model for KQL Database and Eventhouse, we are enabling public preview for Kusto Cache consumption. This change is scheduled to start rolling out on April 12. This means that you will start seeing billable consumption of the OneLake Cache Data Stored meter from the KQL Database and Eventhouse artifacts. For … Continue reading “Announcement: Kusto Cache Storage Public”

March 21, 2024 by Guy Reginiano

Users of Azure SQL can now take advantage of an enhanced monitoring solution for their databases and leverage integration with Microsoft Fabric. With the introduction of the new Database Watcher for Azure SQL (preview), users gain access to advanced monitoring capabilities. Through integration with Microsoft Fabric, they can effortlessly stream, store, and analyze monitoring data … Continue reading “Fabric Real-Time Analytics Integrates with Newly Announced Database Watcher for Azure SQL”