By Bence Balázs
•
11 Jun, 2024
Why do companies need a data platform?

In today's data-driven world, a data platform is essential for companies that want to harness the full potential of their information. Data platforms streamline the integration, processing, and analysis of vast data sets, enabling businesses to make informed decisions quickly. By leveraging a data platform, companies can enhance operational efficiency, drive innovation, and gain a competitive edge in their respective industries.

What is Fabric and what makes it special?

Fabric is an all-in-one data platform designed to meet the needs of modern enterprises. It integrates data management, analysis, and application development into a single, cohesive framework. This consolidation allows organisations to streamline their data operations, from ingestion to insights, eliminating the need for multiple disjointed tools and giving them a single source of data truth. Fabric addresses the full spectrum of data management needs, from integration to analytics and everything in between, with the following rather impressive feature list:

Data Integration: Fabric is your hub for data transformation and integration. With an ever-growing number of connectors, you can get data from anywhere and transform it exactly the way you want.

Data Engineering: Fabric provides robust data engineering tools designed to prepare and transform data for analysis. This includes automating data pipelines and scaling data processing workflows, which are essential for maintaining data integrity and relevance. It also features the Lakehouse concept, which makes it possible to work with large amounts of data while offering great opportunities for collaboration.

Data Warehousing: At the core of its architecture, Fabric incorporates a powerful data warehouse solution that optimises data storage for efficient querying and reporting. This centralised repository enables complex data analysis and supports the decision-making process.
No matter how big your dataset is, Synapse Data Warehouse can handle it!

Business Intelligence: Power BI, one of the most widely used BI solutions, feels to many users like "Excel on steroids", so there is a large group of influential stakeholders ready to adopt it and turn data into insights.

Real-Time Analytics: With Fabric, businesses can leverage real-time analytics to gain instantaneous insights from their data. Whether it's IoT data or logs, you get the insights you need, quickly and accurately.

Data Science: Last but not least, Fabric extends its capabilities to data science, offering tools and environments that foster the development of predictive models and advanced analytics. Here you'll find everything you need to build, train, and deploy advanced AI models.

OneLake: OneLake is Fabric's integrated data lake solution, designed to handle large volumes of diverse data in its native format. It is the central layer of Microsoft Fabric: this is where all your data is brought together and organised, so you can easily discover insights that would otherwise remain hidden. You also have the option to make this data available via separate data warehouses and lakehouses in different workspaces, each with its own security and policies. You get one OneLake per tenant, with data in different containers. Each OneLake can be split into multiple workspaces with their own access rules, so each team can manage its own data. Furthermore, you can host and explore all kinds of files with the different workload tools. Another handy feature is shortcuts, which reference other storage locations. Shortcuts let you work with data even when it is not hosted in Azure, so you avoid the risks of copying data.

Unique feature: Warehouse vs Lakehouse

Fabric adeptly bridges traditional data warehouse capabilities with the scalability of data lakes, forming a hybrid 'lakehouse' architecture.
This integration offers the structured query capabilities of a data warehouse, combined with the vast data handling and machine learning readiness of a data lake.

While both data warehouses and lakehouses serve as critical repositories for organisational data, they cater to different needs and use cases. A data warehouse is highly optimised for fast querying and streamlined reporting of structured data, primarily through SQL. It excels in scenarios where stability, data quality, and quick access to processed data are paramount.

A lakehouse, on the other hand, combines the robust querying capabilities of a data warehouse with the flexibility of a data lake. It is designed to handle not only structured but also semi-structured and unstructured data, supporting a wider variety of data formats such as CSV, JSON, Parquet, and Delta. This makes the lakehouse ideal for more extensive data science and machine learning projects that benefit from large, varied datasets and require more complex data processing capabilities. The choice between a data warehouse and a lakehouse therefore typically depends on the specific data strategy and analytical demands of the organisation.

Fabric use case: NZA x DataOps House

Challenge: New Zealand Auckland (NZA) was perfect, fertile ground for Fabric. NZA's data comes from multiple sources, and it was processed mainly through manual, non-automated steps. That is not only time-consuming and error-prone, but also limits scalability and makes it hard to maintain a single source of data truth. Because the business had already adopted Power BI, data literacy was at a medium level, Microsoft was already a partner, and the data volume was limited to 40 stores and 1 webshop, Fabric stood out as the best choice.

Solution: The implementation was a collaborative effort between DataOps House and NZA. Together, we strategically harnessed Microsoft Fabric, utilising its robust data pipelines and containerisation capabilities.
After careful planning, we started from the ground up by creating the environment. We then developed a solution that automates navigating the SFTP folder, ingesting data daily, applying transformations via Fabric Dataflows, and centralising the data in a Fabric warehouse with automatically updated tables that are ready to be consumed and to fuel dashboards, while also taking advantage of the Power BI capabilities of Fabric.

Result: guaranteed availability of a complete and up-to-date Power BI dashboard for daily decision-making.

Delivered components:

Azure Landing Zone and MS Fabric Setup: NZA received comprehensive assistance in setting up their Azure subscription and configuring Microsoft Fabric within their environment. This included guidance on account provisioning, subscription management, and fine-tuning Microsoft Fabric to integrate seamlessly with NZA's existing infrastructure.

Fabric Capacity Configuration: Additionally, DataOps House provided support in configuring the Fabric capacity, ensuring optimal performance and scalability for NZA's data operations. This involved fine-tuning the capacity settings to align with NZA's data processing requirements and future growth plans.

End-to-end data infrastructure setup: DataOps House handled the full setup, including pipeline development, workspace deployment, table creation, data modelling, and job scheduling.
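
For readers who want a feel for the daily flow described in the Solution section (pick up new files, apply a transformation, refresh a central table), here is a minimal, purely illustrative sketch. It is not the NZA implementation and does not use any Fabric API: a local folder stands in for the SFTP folder, SQLite stands in for the Fabric warehouse, and the function name `ingest_daily_files`, the `sales` and `loaded_files` tables, and the `store`, `date`, and `amount` columns are all hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def ingest_daily_files(landing_dir: Path, conn: sqlite3.Connection) -> int:
    """Load every not-yet-loaded CSV from landing_dir into the sales table.

    Returns the number of rows inserted during this run.
    """
    # Hypothetical target table plus a small log of files already processed.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(store TEXT, sale_date TEXT, amount REAL, source_file TEXT)"
    )
    conn.execute("CREATE TABLE IF NOT EXISTS loaded_files (name TEXT PRIMARY KEY)")
    already_loaded = {row[0] for row in conn.execute("SELECT name FROM loaded_files")}

    rows_loaded = 0
    for path in sorted(landing_dir.glob("*.csv")):
        if path.name in already_loaded:
            continue  # skipping known files makes a re-run harmless
        with path.open(newline="", encoding="utf-8") as handle:
            for record in csv.DictReader(handle):
                # Transformation step: tidy the store name, cast the amount.
                conn.execute(
                    "INSERT INTO sales VALUES (?, ?, ?, ?)",
                    (
                        record["store"].strip().upper(),
                        record["date"],
                        float(record["amount"]),
                        path.name,
                    ),
                )
                rows_loaded += 1
        conn.execute("INSERT INTO loaded_files VALUES (?)", (path.name,))
    conn.commit()
    return rows_loaded
```

Scheduling this function once a day and pointing a dashboard at the `sales` table mirrors the shape of the solution. In Fabric itself, these roles would be played by pipelines, Dataflows, and a warehouse rather than by hand-written code.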