Making sense of big data
New sources of data equate to new business opportunities, but with the exponential growth of data there’s a new challenge for companies – how to store, manage and extract value from it in a cost-effective way.
The ‘big data’ explosion now unfolding is the result of three broad drivers coming together to produce “the perfect storm”: growth in technology, people adopting those technologies and, of course, business needs.
“Data is no longer a by-product of running a business, it is the raw material needed to stay in business and compete effectively,” explains Jason Bath, head of business analytics, database and technology, SAP MENA.
“Businesses want to get their hands on as much detailed data as possible in order to extract information, get insight, and make actionable business decisions. Detailed customer behaviour across multiple touch points is captured and analysed,” he adds. “Enterprise applications are able to generate extremely detailed data; for example, mobile operators typically generate billions of call detail records in any given month.
“Emerging technologies, changing behaviour and the competitive business landscape are all forces that are coming together to produce the perfect data storm.”
New sources of data equate to new business opportunities, but with the exponential growth of data there’s a new challenge for companies – how to store, manage and extract value from it in a cost-effective way.
Bath advises companies to match the storage tier to the value of the data, to use advanced database compression features that save on data centre real estate and energy costs, and to avoid creating multiple copies of data.
“New applications are driving the demand for extreme performance and real-time access to information while, at the same time, increasing compliance regulations are driving the requirement to retain historical data. Therefore the need to match the storage technology to the value and access patterns of data becomes critical in achieving a balance between performance and cost,” he explains.
“For example, a real-time social media feed of customer sentiments following a major product launch needs to be analysed frequently and may be best stored in a high performance in-memory database, while rarely accessed archival data retained for compliance purposes may be more economically stored in traditional high density disk drives.
“However, with the rapid decline in the cost of memory, which today already offers lower cost-per-performance compared to traditional disk drives, it is becoming more cost effective to store databases entirely in memory. By looking at the rate at which the cost of memory is declining, it is predicted that by 2017, it will cost the same per capacity as disk drives.”
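As a rough illustration of the tiering Bath describes, the rule of matching storage to the value and access pattern of the data can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the tier names, the `Dataset` fields, the `choose_tier` function and the 1,000-reads-per-day threshold are all hypothetical and do not come from SAP or any vendor product.

```python
from dataclasses import dataclass

# Hypothetical tier labels, chosen only for illustration.
IN_MEMORY = "in-memory database"
DISK = "high-density disk"


@dataclass
class Dataset:
    name: str
    reads_per_day: int      # how often the data is accessed
    needs_real_time: bool   # must queries return in real time?
    compliance_only: bool   # retained purely to satisfy regulation


def choose_tier(ds: Dataset, hot_reads_per_day: int = 1000) -> str:
    """Pick a storage tier from the access pattern and value of the data.

    The threshold is an assumed figure, not a recommendation.
    """
    # Archival data kept only for compliance goes to the cheaper tier.
    if ds.compliance_only and not ds.needs_real_time:
        return DISK
    # Frequently analysed or real-time data justifies the faster tier.
    if ds.needs_real_time or ds.reads_per_day >= hot_reads_per_day:
        return IN_MEMORY
    return DISK


sentiment_feed = Dataset("launch_sentiment_feed", 50_000, True, False)
archive = Dataset("call_records_archive", 2, False, True)

for ds in (sentiment_feed, archive):
    print(ds.name, "->", choose_tier(ds))
```

In practice the rule would be driven by measured access statistics and real storage prices, and as the cost of memory falls the threshold for the in-memory tier would simply be lowered.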
Many companies, including IBM, Oracle and SAP, offer solutions built around these kinds of strategies, and hope organisations will turn to them as they try to make the most of big data economically.
“Our position with respect to technology investment is that it has to be supportive of a company’s business objectives and must demonstrate an overall lower total cost of ownership,” says Ismael Hassa, sales director, Oracle. “Hence, a modular, scalable approach to next generation analytics is the recommended way forward,” he adds.
“The departure point for any company embarking on the quest to gain control of big data is to work with experienced people and proven technology with the outcome of firstly understanding the requirement and its impact on the business, conduct capacity planning with the goal to grow as needed and ensure investment protection along the way.”
It is these new technologies that allow organisations to perform meaningful analysis of large amounts of data, in turn providing business value.
“Big data technologies describe a new generation of technologies and architectures, designed so organisations can economically extract value from very large volumes of a wide variety of data by enabling high velocity capture, discovery, and/or analysis,” says Philip Roy, director, data computing division, EMC.
“This world of big data requires a shift in computing architecture so that companies can handle both the data storage requirements and the heavy server processing required to analyse large volumes of data economically. New ‘information taming’ technologies such as deduplication, compression and analysis tools are driving down the cost of creating, capturing, managing, and storing information to one-sixth the cost in 2011 in comparison to 2005,” he adds.
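To give a sense of how deduplication and compression cut storage costs, the following minimal Python sketch keeps a single compressed copy of each distinct record. It is an assumption-laden illustration, not a description of EMC's or any other vendor's technology: the `store_deduplicated` function and the sample records are hypothetical, and real products typically deduplicate at the block level rather than on whole records.

```python
import hashlib
import zlib


def store_deduplicated(records: list[bytes]) -> dict[str, bytes]:
    """Keep one compressed copy of each distinct record.

    Records are keyed by their SHA-256 digest, so identical content
    is stored only once, however many times it appears in the input.
    """
    store: dict[str, bytes] = {}
    for record in records:
        digest = hashlib.sha256(record).hexdigest()
        if digest not in store:
            store[digest] = zlib.compress(record)
    return store


# Made-up sample data: three identical call records and one SMS record.
call = b"customer=42;event=call;duration=60;" * 20
sms = b"customer=7;event=sms;" * 20
records = [call, call, call, sms]

store = store_deduplicated(records)
raw_bytes = sum(len(r) for r in records)
stored_bytes = sum(len(c) for c in store.values())

print(len(records), "records in,", len(store), "unique copies stored")
print(raw_bytes, "raw bytes ->", stored_bytes, "bytes stored")
```

The saving shown here depends entirely on how repetitive the sample data is; it says nothing about the ratios achievable on real workloads.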