Leveraging your data assets to enable Big Data


Getting an edge on the competition with Big Data and Analytics is top of mind in almost every business today. Whether the objective is to create a customer-centric business approach or to improve an existing operational or performance excellence business model, the focus is on how data and prediction models can strategically improve the business. The success of these new projects will depend to a large extent on the amount and quality of the data that can be analyzed.

To give a new Big Data initiative a chance to make a difference, it is often best to launch it with additional resources that bring in new data science skills and report directly to executive leadership. This is because the new project should ask new questions that challenge old assumptions and traditional BI thinking. However, separating the new team too much has its risks, and it is advisable to include at least one change-oriented, systems-knowledgeable old-timer. Otherwise the new project team may become preoccupied with collecting data from the abundance of new sources, such as mobile devices, social media, and the Internet of Things, while overlooking existing enterprise data.


Companies typically have huge quantities of existing data related to their customers, products, competitors, and operations. The advantage of this data source is that it is highly relevant to the business and does not have to be collected or purchased. The disadvantage is that it is poorly organized and needs to be discovered in the maze of IT systems the company has created over time. A blend of existing data and newly collected data is often the best starting point for big data analytics.

The challenge with existing data is that it was never collected and stored with analytics in mind. Most of the data has been generated for disparate purposes and stored in different systems using different technologies. Furthermore, there is frequently a lot of duplicated and missing data, as well as data whose meaning is hard to determine because metadata is rarely available or is poorly maintained.
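To make the duplication and missing-data problem concrete, here is a minimal sketch of how a single column might be profiled before it is used for analytics. The function name profile_column, the choice to treat empty strings as missing, and the sample values are assumptions made for illustration only; they are not taken from any particular product.

```python
from collections import Counter


def profile_column(values):
    """Summarize missing and duplicated values in one column (hypothetical helper)."""
    # Assumption: None and empty strings both count as missing.
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    total = len(values)
    return {
        "rows": total,
        "missing_ratio": (total - len(non_null)) / total if total else 0.0,
        "distinct": len(counts),
        # Values that occur more than once, sorted by their string form.
        "duplicated_values": sorted(
            (v for v, n in counts.items() if n > 1), key=str
        ),
    }


# Made-up sample data: one repeated address and two missing entries.
emails = ["a@x.com", None, "b@x.com", "a@x.com", ""]
report = profile_column(emails)
print(report)
```

Running a summary like this over every column is feasible for a handful of tables, but it quickly becomes impractical by hand as the number of systems grows.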

In theory, DBAs and SMEs could identify the data manually, but in practice the number of data stores, tables, and columns is far too high. An automated approach is required for success, and that is why ROKITT created its Enterprise Data Management solution (ROKITT Astra).

ROKITT Astra automatically discovers the data in all databases, including how data elements are related. The discovery leverages multiple techniques, including machine learning, to find relations while minimizing the manual work required. Once discovery is complete, it is easy to explore and understand the existing data landscape and to extract data. Because the data relations have been discovered, all data relevant to the analytical hypothesis can be found and extracted. This makes it more likely that features with strong predictive value will be found.
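As a rough illustration of what discovering relations between data elements can mean, the sketch below flags column pairs in different tables where nearly all values of one column also appear in the other. This is a simple value-overlap heuristic and is not a description of how ROKITT Astra works; the function name candidate_relations, the min_containment threshold, and the sample tables are all hypothetical.

```python
from itertools import permutations


def candidate_relations(tables, min_containment=0.9):
    """Return (child, parent, containment) for heavily overlapping column pairs.

    tables maps table name -> {column name -> list of values}.
    Hypothetical helper; the threshold is an arbitrary illustrative default.
    """
    # Flatten to {(table, column): set of non-null values}.
    columns = {
        (table, column): {v for v in values if v is not None}
        for table, cols in tables.items()
        for column, values in cols.items()
    }
    found = []
    for child, parent in permutations(columns, 2):
        if child[0] == parent[0]:
            continue  # only compare columns from different tables
        child_vals, parent_vals = columns[child], columns[parent]
        if not child_vals:
            continue  # an empty column tells us nothing
        # Share of the child's distinct values that also exist in the parent.
        containment = len(child_vals & parent_vals) / len(child_vals)
        if containment >= min_containment:
            found.append((child, parent, containment))
    return found


# Made-up sample data: orders.cust looks like a reference to customers.id.
tables = {
    "customers": {"id": [1, 2, 3, 4], "name": ["Ann", "Bo", "Cy", "Di"]},
    "orders": {"order_no": [100, 101, 102], "cust": [1, 1, 3]},
}
relations = candidate_relations(tables)
print(relations)
```

A heuristic like this only proposes candidates. Small integer columns overlap by coincidence, so real discovery needs additional signals and review, which is one reason combining several techniques matters.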

Getting the data is a big part of any data science project. ROKITT Astra is a tool that makes it easy to find data emanating from existing enterprise data stores. As this data has been gathered in your business operation, it is highly likely to contain very relevant information about your customers and their attitudes to your products and services. Furthermore, you already have this data. All you need is to rediscover an asset you have and make good use of it.
