In addition to building integration solutions, we have also been building a data platform powered by Microsoft Azure technologies such as Azure Synapse. As part of this platform, we have the following two key use cases for how users interact with it.
- A user uploads a file to SharePoint, and we want it to be imported into the Data Lake and to trigger a Synapse pipeline that starts processing the data.
- When a Synapse pipeline has finished processing data, we want to be able to export small/medium datasets and host them in SharePoint so it's easy for users to interact with the data.
These use cases cover manual import and export scenarios for your data platform. It is possible to upload files directly to the data lake via the Azure tools, but the Azure Portal, Storage Explorer and the other tools for interacting with the Data Lake are not ones that business users would be familiar with.
From an export perspective, users could view a report in Power BI and download the data set on demand, but this only covers half of the requirements.
We decided to create a SharePoint site for our Data Platform. That keeps us in the SharePoint ecosystem, which business users are comfortable with, and lets us have two document libraries:
- To Data Platform = Users add documents to this library and we replicate these files to the data lake
- From Data Platform = This document library is used to export data from Synapse and the Data Lake, which is replicated to SharePoint
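To make the two directions concrete, here is a minimal sketch in Python of how the routing between the two libraries and the data lake could be planned. It is an illustration under stated assumptions, not our actual implementation: the `raw/sharepoint-uploads` landing folder, the date-based folder layout, and the `CopyInstruction`, `plan_import` and `plan_export` names are all hypothetical, and the sketch only decides where a file should go without calling SharePoint, the Data Lake or Synapse.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import PurePosixPath

# The two document library names come from the design above.
TO_PLATFORM_LIBRARY = "To Data Platform"
FROM_PLATFORM_LIBRARY = "From Data Platform"

# Assumption: a hypothetical landing folder in the data lake for uploads.
LAKE_IMPORT_ROOT = "raw/sharepoint-uploads"


@dataclass(frozen=True)
class CopyInstruction:
    """Describes one file copy; a hypothetical type for this sketch."""
    source: str
    destination: str
    trigger_pipeline: bool


def plan_import(file_name: str, uploaded_at: datetime) -> CopyInstruction:
    """SharePoint -> Data Lake: land the file in a dated folder (assumed
    layout) and flag that a Synapse pipeline should be started."""
    dated_folder = uploaded_at.strftime("%Y/%m/%d")
    destination = PurePosixPath(LAKE_IMPORT_ROOT) / dated_folder / file_name
    return CopyInstruction(
        source=f"{TO_PLATFORM_LIBRARY}/{file_name}",
        destination=str(destination),
        trigger_pipeline=True,
    )


def plan_export(lake_path: str) -> CopyInstruction:
    """Data Lake -> SharePoint: publish a processed file into the export
    library under its original file name. No pipeline is triggered."""
    file_name = PurePosixPath(lake_path).name
    return CopyInstruction(
        source=lake_path,
        destination=f"{FROM_PLATFORM_LIBRARY}/{file_name}",
        trigger_pipeline=False,
    )


if __name__ == "__main__":
    upload_time = datetime(2024, 1, 5, tzinfo=timezone.utc)
    print(plan_import("sales.csv", upload_time))
    print(plan_export("curated/reference/countries.csv"))
```

Keeping the planning separate from the actual copy means the mapping rules can be checked without any Azure or SharePoint connection, whichever tool ends up moving the files.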
From a logical perspective, this solution would look like the diagram below.
This solution is fairly simple to implement, and I will cover the specifics in an upcoming post. What this architecture enables is that users can upload existing data sets that the business holds in Excel and CSV files. These are then imported into the data lake, which allows those users to participate in our data processing.
In terms of export, we have reference data sets that we create, along with aggregated data sets that we want users to be able to play around with in Excel. We export these so business users can experiment with them and feed new requirements back to our data platform. While many data scientists and data analysts are comfortable with the common data tools, making the data available in Excel lets business domain experts experiment as citizen data analysts without needing the advanced skills that Power BI or Data Lakes require.