Ive been playing around with Synapse a bit and wanted to explore options for using synapse from Logic Apps. This is a walk through of how you can query data from Synapse to use it within an integrated process in your Logic Apps.
Before you get into this I am assuming you have already got a synapse workspace setup, you have got a storage account setup as your datalake which is setup as a linked resource and you are going to put a file in your datalake storage which will be the test data we will use for this post. Im also assuming your managed identity for synapse has access to the storage account.
In Synapse workspace you can see I have a serverless sql database called Test and the picture shows my view which we will walk through setting up.
Add Test Data to the Datalake
In one of the Microsoft samples they have some examples involving population data from Slovenia. Im using that file as the example. Below is the format.
There is a walk through on Microsoft’s pages where they cover some of this but I found if a bit painful to get going so hopefully this will help clarify some of their bits. – https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-sql-on-demand
In my storage account I have uploaded the file to a container.
I can now see this file in Synapse as a linked datalake and I can see my file like we did in the storage account above.
Create your Credential for the External Data Source
In this post to keep it simple to get started I am going to access the file in storage under the context of the managed identity for the synapse workspace.
There are a number of other options available such as shared access keys and pass through AD. Those are out of the scope of what I want to cover here. We will use the managed identity for the workspace and manage access to the view in Synapse.
The below script will setup my managed identity within my test database as a scoped credential I can reference.
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity WITH IDENTITY = 'Managed Identity'
Create External Data Source
Next I need to create an external data source in Synapse which will reference my WorkspaceIdentity credential which will access the folder with my population data in it.
CREATE EXTERNAL DATA SOURCE [HelloWorld] WITH ( LOCATION = 'https://blogmsdemodl.dfs.core.windows.net/blogmsdemofs/Raw/helloworld', CREDENTIAL = [WorkspaceIdentity] )
Note i did have a bunch of pain here when I got the casing wrong for my url (notice Raw not raw). Just watch out for that.
Create View in Synapse
Next up I created a view in synapse which points to my test data csv file.
CREATE VIEW populationView AS SELECT * FROM OPENROWSET( BULK 'SloveniaTestData.csv', DATA_SOURCE = 'HelloWorld', FORMAT = 'CSV', FIELDTERMINATOR =',', ROWTERMINATOR = '\n' ) WITH ( [country_code] VARCHAR (5) COLLATE Latin1_General_BIN2, [country_name] VARCHAR (100) COLLATE Latin1_General_BIN2, [year] smallint, [population] bigint ) AS [r];
Check you can Query the View in Synapse
Its probably a good idea to check you can query your view in the SQL view of synapse before you use it in a logic app. Lets do that here to make sure the view works.
Select * From [dbo].[populationView]
Setup Service Principal for Logic App
In order to query the view from Logic Apps I am going to use the SQL Connector with a Service Principal. The service principal is called “logicapp-synapse-connector“.
I need to get the client id and secret for use in configuring the logic app connector. Im assuming here your already familiar with creating an App Registration in Azure AD but if not there is a walk through here: https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal
Setup Synapse Access for Logic App
Back in Synapse we need to register a login for our user which points to Azure AD for authentication and also create a database user for it. Use the below SQL.
CREATE LOGIN [logicapp-synapse-connector] FROM EXTERNAL PROVIDER CREATE USER [logicapp-synapse-connector] FROM LOGIN [logicapp-synapse-connector]
Grant the Logic App Service Principal access to use the scoped credential
In the steps above I used the managed identity for connecting the view to the storage so it runs as the synapse managed identity under the hood. If you take this approach out of the few that are available then you need to assign the user who will run the view permission to reference the database credential that will be used to access the storage. To do this you use the below SQL in Synapse to provide this permission.
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::WorkspaceIdentity to [logicapp-synapse-connector];
Grant the logic app Service Principal access to query the view
Next up we need to grant permission for the logic app service principal user to access the view. Run the below sql to do that.
GRANT SELECT ON OBJECT::[dbo].[populationView] TO [logicapp-synapse-connector]
Logic App Connector Setup
Next we will go to our logic app where we will create a connection to the synapse workspace via the SQL connector. Adding a SQL action for get rows we will add the below connector settings and we will use the client id and secret from the service principal.
On the Get Rows (v2) action we will supply the database server and database name and the view name as the table.
The key thing to note here is the server name comes from your Synapse workspace overview page in the Azure Portal and you want the serverless SQL endpoint.
Run Logic App to Test
Back in my Logic App I am simply wanting to test connectivity and basic data retrieval. I run it with an HTTP action then query the view like below.
At runtime you can see in the history I have now got the results from Synapse just like from any other SQL query.
The designer for the SQL Connector picks up the definition for the view so you can see I automatically get the designer properties too like country_code below so I can loop over the results and use the fields as normal.
I think the combination of using Synapse along side Logic Apps is pretty interesting. Synapse can provide that centralized data warehouse / data hub which you can use as a data source for data you want to integrate with other systems. Obviously Synapse also comes with a bunch of other features such as Data Flows which give you options for bulk pushing and pulling of data into Synapse but that just means you have many options in your tool kit. Synapse as the central data source to consolidate views of data is one aspect and then the other aspect is a choice of Data Flows or Logic Apps, which is really just the same decision about if you use Data Factory or Logic Apps for an interface which is really down to the requirements of that specific interface and about data integration vs process integration.