Summary
A data warehouse is a central storage system that holds data from various sources. Data from transactional systems, relational databases, and other sources flows into it. In other words, a data warehouse is a collection of organizational data obtained from operational and external data sources. Its main characteristics are that it is subject-oriented, integrated, non-volatile, and time-variant. It is subject-oriented because it provides information organized around themes or subjects rather than organizational operations. It is integrated because it is developed by combining data from different sources. It is non-volatile because data, once loaded, is not normally updated or deleted. It is time-variant because its records carry time keys such as date, month, and time. In short, a data warehouse is a large store of data retrieved from different sources.
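The characteristics described above can be illustrated with a minimal sketch. It is not a real warehouse implementation: the `WarehouseRow` type, the `integrate` helper, the source names `pos_system` and `web_shop`, and the "sales" subject are all hypothetical, chosen only to show integration from several sources, append-only (non-volatile) loading, and a time-variant key.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row cannot be modified (non-volatile)
class WarehouseRow:
    subject: str      # theme the row belongs to (subject-oriented)
    source: str       # system the record came from (integrated)
    load_date: date   # time key attached to every row (time-variant)
    amount: float


def integrate(warehouse, source, records, load_date):
    """Append records from one source; existing rows are left untouched."""
    for rec in records:
        # Amounts arrive as strings or numbers, so they are converted to float.
        warehouse.append(
            WarehouseRow("sales", source, load_date, float(rec["amount"]))
        )
    return warehouse


warehouse = []
integrate(warehouse, "pos_system", [{"amount": "19.99"}], date(2024, 1, 31))
integrate(warehouse, "web_shop", [{"amount": 5}], date(2024, 2, 29))

# Time-variant query: total sales amount per (year, month) of loading.
totals = {}
for row in warehouse:
    key = (row.load_date.year, row.load_date.month)
    totals[key] = totals.get(key, 0.0) + row.amount
```

Here the two sources deliver amounts in different types, and the load step converts both to one representation before appending, which is the integration step in miniature.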
The Difference Between Data Warehouse and Data Mart
Problems When Operational Data Are Integrated into the Data Warehouse
Several challenges arise when operational data are integrated into a data warehouse. One problem is data homogenization: forcing data from various sources into similar formats may result in the loss of valuable parts of the data. Another challenge is the difficulty of controlling access to data. An organization may be unable to decide which departments should have access to the warehouse. A further problem is the high cost of maintenance. Because the data comes from different sources, care must be taken to ensure that it is not reorganized as it arrives.
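The homogenization problem can be sketched in a few lines. The example below is an illustration under assumed data, not a description of any particular tool: the record layouts, the field names, and the `homogenize` helper are hypothetical. It reduces records from two sources to a common format and shows which information does not survive.

```python
from datetime import datetime


def homogenize(record):
    """Reduce a source record to a common format: id and calendar date only."""
    ts = datetime.fromisoformat(record["timestamp"])
    return {"id": record["id"], "date": ts.date().isoformat()}


# Hypothetical source records: one carries a time of day and an extra field,
# the other carries only a date.
crm_record = {"id": 1, "timestamp": "2024-03-05T14:30:00", "channel": "phone"}
billing_record = {"id": 2, "timestamp": "2024-03-05"}

unified = [homogenize(crm_record), homogenize(billing_record)]

# Fields of the richer record that have no place in the common format.
lost_fields = set(crm_record) - {"id", "timestamp"}
```

After homogenization both records look alike, but the time of day and the `channel` field of the first record are gone, which is the kind of loss the paragraph above describes.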