Data Virtualisation: Diet for Big Data
Ash Ashutosh, founder and CEO at copy data virtualisation vendor Actifio, recently shared at VMworld his vision of the role data should play in a software-defined data centre.
While infrastructure virtualisation is already part of everyday life for more and more companies, the virtualisation of data is still uncharted territory for many. Yet it is the right concept for stemming the flood of data, which often results from multiple data copies. This calls for a completely new approach to managing data copies.
I recently told delegates at VMworld that data should play the key role in the vision of a software-defined data centre. During my presentation, I outlined new solutions that make virtualised environments more effective and more efficient through the virtualisation of data.
I talk almost every day with customers who want to adapt their data centres to meet the needs of tomorrow, but fail because their application data is still tied to the infrastructure of yesterday.
Data virtualisation, as offered by specialists in this area such as Actifio, promises a way out of the flood of data that grows daily. Actifio's new SLA-driven solution makes a conventional data management infrastructure redundant and so puts an end to the endless duplication of data copies. Separate data silos and individual applications for data backup, disaster recovery, business continuity, compliance, analysis, and test and development will no longer be needed. Instead, there is a single “golden master copy”, from which any number of virtual copies of the data from any point in time can be made available, in real time and without large storage requirements.
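The idea of one golden copy serving many virtual copies can be illustrated with a minimal copy-on-write sketch in Python. This is not Actifio's implementation; the class names `GoldenCopy` and `VirtualCopy`, the block layout and the snapshot labels are all hypothetical and chosen only for illustration. Each virtual copy reads from a shared point-in-time image and stores only the blocks it changes.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualCopy:
    """A writable view on a shared snapshot (hypothetical illustration)."""
    base: dict[int, bytes]                      # shared, never modified here
    overlay: dict[int, bytes] = field(default_factory=dict)

    def read(self, block_id: int) -> bytes:
        # Prefer a locally changed block, otherwise fall back to the snapshot.
        return self.overlay.get(block_id, self.base[block_id])

    def write(self, block_id: int, data: bytes) -> None:
        # Copy-on-write: the change is stored only in this copy's overlay.
        self.overlay[block_id] = data

    def extra_bytes(self) -> int:
        # Storage this copy consumes beyond the shared golden copy.
        return sum(len(block) for block in self.overlay.values())


@dataclass
class GoldenCopy:
    """Single physical copy, kept as labelled point-in-time block maps."""
    snapshots: dict[str, dict[int, bytes]] = field(default_factory=dict)

    def capture(self, label: str, blocks: dict[int, bytes]) -> None:
        # Record a point-in-time image of the production data.
        self.snapshots[label] = dict(blocks)

    def virtual_copy(self, label: str) -> VirtualCopy:
        # No data is duplicated; the new copy only references the snapshot.
        return VirtualCopy(base=self.snapshots[label])


golden = GoldenCopy()
golden.capture("monday", {0: b"orders", 1: b"customers"})

test_copy = golden.virtual_copy("monday")        # e.g. for test and development
analytics_copy = golden.virtual_copy("monday")   # e.g. for analysis

test_copy.write(1, b"masked")                    # affects only test_copy
```

In this sketch the test copy sees its own change, while the analytics copy and the golden copy remain untouched, and each virtual copy costs only as much storage as the blocks it has modified.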
The background to Actifio's approach: the analysis of big data promises strategic benefits, but it first requires capturing huge amounts of web, network, server log and social media data. The total amount of data is growing steadily. According to an estimate by analyst firm IDC, however, almost two-thirds of the data stored in the data centre consists of countless copies of the same data or copies of extremely outdated data. These are mainly copies of production data, generated by separate applications for data protection, data management, recovery, test and development, analysis, etc.
It is therefore necessary to store and manage data efficiently and intelligently. The benefit of big data ultimately depends on whether large amounts of data can be made available for analysis and interpretation in the required quality, within a short time and with little effort. Data virtualisation specialists agree with big data experts that it is the data already there that has to be analysed. Eliminating the redundancies first is therefore a prerequisite for analysing the data in a meaningful way.
This problem is, however, often ignored. In many places, IT is still busy trying to get the flood of data under control, much of it generated by intentional or unintentional copying. Only the symptoms are being treated, rather than the cause. A strict data diet is therefore called for, to prevent unchecked and uncontrolled data growth before it happens. At the same time, a fitness programme is needed to make the data more agile and to enable scalable, distributed use, including in the cloud. Traditional software and hardware structures, however, stand in the way of this therapy.
So far, few companies have opted for a holistic approach to data management, namely the creation of virtual copies of their corporate data. Only then can they decouple the data from the underlying infrastructure. Just as server virtualisation led the way a decade ago, it is now time to say goodbye to physical data and application silos. Instead of multiple redundant copies in circulation, a single “golden copy” of production data is enough. From this copy, countless virtual copies can be created without wasting space on the server. The sooner unnecessary physical copies of data are reduced, the less time, infrastructure and money go into managing and archiving them. And the streamlined dataset is then fit for efficient big data analysis that delivers results, instead of even more data.