Is Enterprise Data Rationalization Possible?
Your data is a goldmine. At least it ought to be. Many enterprises find it hard to use existing data to its full potential because the data landscape is plagued by redundancy, inconsistency, and missing relations. Rationalization and simplification hold the promise of making data easier to analyze and of yielding actionable insights. But is it realistic to rationalize data in a large enterprise? In theory yes, but in practice it is rarely done. Why is that the case? In this blog post we will consider common obstacles and suggest a path forward to a better data landscape.
Many problems stem from the fact that data assets are poorly managed. It is easier to create new databases with duplicates of data than to find out where relevant data may already exist. This is often aggravated by access constraints or organizational boundaries. While access rights management is important, denying access to data in a particular database does not improve security if it results in the same data being stored in another database. In fact, one can argue that the duplication increases risk, as intruders get two possible targets for theft or sabotage. In addition to the increased security vulnerability, complexity grows for no good reason, which leads to increased costs and possibly two versions of the truth.
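To make the duplication problem concrete, here is a minimal sketch of how copies of the same data could be spotted across databases by fingerprinting column contents. It is an illustration only, not how any particular product works: the function names, the table and column names, and the sample values are all hypothetical, and a real landscape would need sampling, fuzzy matching, and access to actual database metadata.

```python
import hashlib
from collections import defaultdict


def column_fingerprint(values):
    """Hash the sorted set of distinct, normalized non-null values in a column."""
    normalized = sorted({str(v).strip().lower() for v in values if v is not None})
    digest = hashlib.sha256()
    for item in normalized:
        digest.update(item.encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()


def find_duplicate_columns(tables):
    """Group columns that hold the same set of values.

    `tables` maps a table name to {column name: list of values}.
    Returns a list of groups, each a sorted list of (table, column) pairs.
    """
    by_fingerprint = defaultdict(list)
    for table_name, columns in tables.items():
        for column_name, values in columns.items():
            if not any(v is not None for v in values):
                continue  # empty columns would all match each other; skip them
            fingerprint = column_fingerprint(values)
            by_fingerprint[fingerprint].append((table_name, column_name))
    return [sorted(locations) for locations in by_fingerprint.values()
            if len(locations) > 1]


# Hypothetical sample data standing in for two separately owned databases.
tables = {
    "crm.customers": {
        "email": ["a@example.com", "b@example.com"],
        "name": ["Ann", "Bob"],
    },
    "billing.accounts": {
        "contact_email": ["B@example.com ", "a@example.com"],
        "balance": [10, 20],
    },
}

duplicates = find_duplicate_columns(tables)
print(duplicates)
```

With this sample data, the two email columns are flagged as holding the same values even though they live in different tables under different names, which is exactly the kind of hidden copy that is hard to find by hand.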
Another reason for the complexity explosion in the data landscape is that, without a clear data strategy, a lot of data is kept without much consideration of whether it should be kept and, if so, how it should be organized. Often nobody knows whether the data has potential value, and the simplest approach is to keep it just to be safe. This approach overlooks that the same information may already be available, in which case all you do is create an unnecessary copy. A better approach would be to investigate whether this is new information with potential value and, if it is, to spend the effort to organize it along tidy data principles.
Finally, cleanup and destruction of unwanted data is rarely funded. Its future value is hard to quantify, and because the benefit is largely cost and risk avoidance, it competes poorly with the implementation of new functionality. It is worth mentioning, though, that systematic destruction of data that is no longer needed has many good effects: reduced complexity, reduced compliance and security risks, and freed-up storage hardware. All of these matter for smooth IT operations and for good cost and risk control.
As is mostly the case, solutions to business/IT problems require a mix of governance, management, and technical activities and tools. Many organizations have articulated IT Principles addressing data governance but struggle to implement them in the enterprise. One problem is that so many stakeholders depend on and touch the data. Another is that data is so pervasive and plentiful that it is impossible to manage without strong automation and powerful tools. Until now, such tools have either not been available or have been unable to scale to real enterprise data landscapes. This is changing quickly with the emergence of automatic data discovery and data preparation solutions, as well as powerful search.
The emergence of tools in this area (such as ROKITT Astra) provides the automation power that allows business and IT to jointly explore, simplify, and organize data for better business value. A rationalization project should be based on your IT Principles and enterprise architecture, driven by data owners from the business, and engineered by IT using automation tools.