Microsoft Fabric: A Simplified Data Analytics Platform

You might have heard the announcement at Microsoft Build by Arun Ulag and the rest of the Microsoft Data and Analytics Platform team about a new service offering named Microsoft Fabric. In this article and video, I want to explain what Fabric is and why you should care. At the end, I will also share my two cents on where I see this offering fitting into the market for data analytics products and services.

Microsoft Fabric: For Simplicity

The best way to understand Microsoft Fabric is to understand its primary purpose: simplicity. The Microsoft team has invested in this new offering over the past two years and devised a way to simplify things. As the Data Analytics Lead of your organization, you don’t have to worry much about the technology; you can instead focus on the results. You don’t have to spend hours and hours figuring out how the licensing of Azure Synapse, Azure Data Factory, and Power BI would work together. Fabric makes it much simpler.

Microsoft Fabric: An Umbrella

I like the umbrella concept. Microsoft did it once in 2015 by bringing Power View, Power Query, and Power Pivot under an umbrella called Power BI. Power BI was a huge success: for the past few years, it has consistently been at the top of Gartner’s Magic Quadrant for BI services.

Fabric is Microsoft’s Data Platform service offering for this new age. It is an umbrella on top of Microsoft’s three main Data Analytics products: Power BI, Azure Data Factory, and Azure Synapse. However, it is easier to understand if you look at it by functionality or workload. Here is what is included in Fabric:

Microsoft Fabric

  • Storage: OneLake
  • Data Integration: Data Factory
  • Data Engineering: Synapse
  • Data Warehousing: Synapse
  • Data Science: Synapse
  • Real-Time Analytics: Synapse
  • Business Intelligence: Power BI
  • Action platform: Data Activator
  • Governance: Purview

When you have all of the above under one umbrella, you have one place to create and edit them, one structure to tie them together (workspaces), one setup for security and similar configuration, and a far simpler licensing plan that covers all of them.

Storage: OneLake

OneLake is a Data Lake technology that emphasizes being the ONE data lake. It will be the storage for all the computing services mentioned above: they will all store their data in OneLake and read it from there. The idea behind using a Data Lake technology is that it covers both structured and unstructured data. OneLake will automatically span regions within one tenant, so there won’t be a need to create a data lake for each region. One Data Lake is enough for all, hence the name OneLake.
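To make the "one lake per tenant" idea concrete, here is a minimal sketch of how a single OneLake namespace can be addressed by workspace and item. The helper name onelake_uri and the sample workspace and lakehouse names are hypothetical, and the URI layout is my assumption based on the ADLS-style addressing OneLake exposes; check the current documentation before relying on it.

```python
# Sketch only: onelake_uri is a hypothetical helper, not part of any SDK.
# Assumed layout: abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<type>/<path>
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, relative_path: str) -> str:
    """Build an ADLS-style URI for a file stored in a Fabric item."""
    # Strip a leading slash so the path joins cleanly onto the item root.
    clean_path = relative_path.lstrip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{clean_path}"


# Two workloads in different workspaces still point at the same lake (same host).
sales_uri = onelake_uri("Sales", "Orders", "Lakehouse", "/Files/2023/orders.csv")
finance_uri = onelake_uri("Finance", "Ledger", "Lakehouse", "Tables/entries")
print(sales_uri)
print(finance_uri)
```

Notice that only the workspace and item change between the two paths; there is no per-region or per-team storage account to provision.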

Data Integration: Azure Data Factory and Dataflow

Microsoft has invested in data integration technologies for a long time. Azure Data Factory is the successor to SSIS (SQL Server Integration Services) and has the power to transfer billions, even trillions, of rows of data. Recent enhancements in Power Query technology also bring Dataflow as a transformation engine, which can now be used alongside Data Factory for a comprehensive Data Integration technology. Azure Data Factory is the ETL technology for the data professional, whereas Dataflow and Power Query are usually the technology for the Data Analyst. In Fabric, the Data Integration experience uses the best of both worlds: it gives you the scalability and the transformation power in one place.

Data Engineering: Synapse

For data engineers, Synapse provides the ability to build the infrastructure using a Lakehouse, and then to build pipelines that ingest data into that structure. There will be connectors for various data sources, and the data will be stored in the Lakehouse as files or tables, depending on the source type. The data can be brought into the Lakehouse using Shortcuts or the Data Integration methods mentioned in the previous section.

Data Warehousing: Synapse

When you work with a large-scale data warehouse, Synapse gives you immense power to manage it. You can query the data with high performance using SQL technology combined with Apache Spark for big data. Azure Data Explorer (Kusto) can be used to interact with this technology, and with Fabric, Kusto is now part of the overall experience. You won’t need a separate tool or editor for it.

Data Science: Synapse

Data Science projects are usually part of the bigger data analytics work. That is why, in Microsoft Fabric, Data Science using Synapse is added as a workload. Data Science is not just a single tool; it is a combination of features and tools used across the entire Fabric. The process can include analyzing the data using Data Wrangler, building models and experiments using MLflow, training models, using Cognitive Services and large language models, and finally making predictions using PREDICT. SynapseML will support all of these in Microsoft Fabric.


Real-Time Analytics: Synapse

The ability to analyze real-time data using IoT Analytics and Log Analytics has been part of Microsoft’s offering for a long time. This ability is now part of Fabric as the Synapse Real-Time Analytics workload. Synapse Real-Time Analytics works with event streaming technologies (such as IoT Hub, Event Hubs, pipelines, etc.). The data is loaded into a KQL database and the Lakehouse via mirroring, ML models can then be used to run experiments on it, and finally Power BI is used to see the results.

Power BI

It is hard not to have heard of Power BI in this age. Power BI is a highly capable analytics technology that can connect to a wide range of data sources. The analytical power of this service enables data analysts to do data preparation, data modeling, and calculations. The visualization engine of Power BI then presents the insights to the users. Power BI works by itself in many data analyst solutions; in many others, it works with other technologies such as Excel and Power Platform. With Fabric, it also works with OneLake storage. The coupling of Power BI and OneLake is more than just another data source: it comes with a new connection type called Direct Lake, which is faster than DirectQuery. I’ll explain this in another article.

Data Activator

Data Activator is a new tool offered as part of Fabric. It is a data-event-trigger system that helps automate actions based on data. For example, you might want a query to run when the value of a certain measure in a Power BI dataset goes above or below a particular amount. The idea of this service is to close the loop from Insights to Action.
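The trigger idea can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Data Activator's actual interface: the ThresholdRule class, the record_alert action, and the "Total Sales" measure are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ThresholdRule:
    """Hypothetical rule: act when a measure leaves the [low, high] band."""
    measure: str
    low: float
    high: float
    action: Callable[[str, float], None]

    def evaluate(self, value: float) -> bool:
        """Run the action and return True if the value is outside the band."""
        if value < self.low or value > self.high:
            self.action(self.measure, value)
            return True
        return False


# Stand-in for a real action such as running a query or sending a message.
fired: List[Tuple[str, float]] = []


def record_alert(measure: str, value: float) -> None:
    fired.append((measure, value))


rule = ThresholdRule("Total Sales", low=1000.0, high=5000.0, action=record_alert)

# Simulated readings of the measure; only the out-of-band ones trigger the action.
for reading in (1200.0, 800.0, 5200.0):
    rule.evaluate(reading)

print(fired)
```

The point of the sketch is the shape of the loop: data arrives, a condition is checked, and an action follows without anyone watching a report.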

Purview

Purview provides a solution to help govern, protect, and manage the data estate. The Data Catalog of Purview will now be able to scan all Microsoft Fabric artifacts (not just Power BI). The Purview hub will be part of the Fabric portal, where users can browse and search for artifacts. Information protection and sensitivity labels will be part of Fabric elements, to help protect the data as the organization needs.

Conclusion

In conclusion, Microsoft Fabric is a complete and unified data platform. It simplifies the complexities of data analytics and integration by bringing together key components like Power BI, Azure Data Factory, and Azure Synapse under one umbrella. Fabric aims to streamline workflows, enhance performance, and provide a more cohesive user experience.
