In our recent articles, we’ve addressed various ways IT leadership can help their companies to become data-driven organizations. We’ve looked at topics such as migrating data to a cloud-based warehouse or data lake, how data can be leveraged to generate insights, and how those insights can be used to automate business processes through machine learning and AI.
We will now take a step back and look at how the different stages of data management need to be integrated to enable genuine business efficiency and innovation. We’re talking about a new approach to data integration known as the modern data stack (MDS).
What is the modern data stack, and how do we create one?
The modern data stack is not a revolutionary new technology reserved for cloud-native businesses and early adopters. It is a set of cloud-hosted applications that are combined to make data integration as efficient as possible.
An MDS is generally made up of the following tools:
- An ELT pipeline: ELT stands for “extract, load, transform”. This is the stage of the MDS where data is moved from different sources, such as web apps like Salesforce or databases like MySQL, into storage. For example, our partner Fivetran provides solutions that manage the steps of the ELT process and automate data integration.
- A data destination: This is the data warehouse or data lake where the data moved by the ELT pipeline ends up. At this stage, the data can be accessed and analyzed. A commonly used tool is Google BigQuery, a cloud data warehouse that allows for fast and easy analysis of your stored data.
- A data transformation tool: This tool transforms the data sitting in your destination into user-friendly models that can be easily queried and used for insights. For this component of your MDS, you can consider Astronomer, which is powered by Apache Airflow.
- A data visualization platform: Here, the data is visualized in dashboards so that non-IT employees can understand it and apply it to their work. A BI platform such as Looker or Looker Studio from Google Cloud helps give teams across your business a common definition of their metrics.
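
To make the flow between these four components more concrete, here is a minimal sketch in Python. It is an illustration only, not how Fivetran, BigQuery, Astronomer, or Looker work internally: an in-memory SQLite database stands in for the cloud warehouse, and the source records, table names (`raw_orders`, `customer_revenue`), and function names are all hypothetical. The point it shows is the ELT ordering: data is loaded into the destination unchanged, and only then transformed there with SQL.

```python
import sqlite3

# Hypothetical source records, standing in for rows pulled from a web app or database.
SOURCE_ROWS = [
    {"order_id": 1, "customer": "acme", "amount_cents": 1250},
    {"order_id": 2, "customer": "acme", "amount_cents": 800},
    {"order_id": 3, "customer": "globex", "amount_cents": 4300},
]


def extract():
    """Extract: read records from the source system (here, an in-memory list)."""
    return list(SOURCE_ROWS)


def load(conn, rows):
    """Load: write the records into the destination unchanged."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :customer, :amount_cents)", rows
    )


def transform(conn):
    """Transform: build a query-friendly model inside the destination, using SQL."""
    conn.execute(
        """
        CREATE TABLE customer_revenue AS
        SELECT customer, SUM(amount_cents) / 100.0 AS revenue
        FROM raw_orders
        GROUP BY customer
        """
    )


def run_pipeline():
    conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse
    load(conn, extract())
    transform(conn)
    # The kind of query a dashboard might issue against the transformed model.
    return conn.execute(
        "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    for customer, revenue in run_pipeline():
        print(customer, revenue)
```

Because the raw table is kept in the destination, the transformed model can be rebuilt or redefined later without extracting the data from the source again, which is one common argument for loading before transforming.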