Get your house in order

Data warehousing is becoming a key IT component of companies’ business processes, but outsourcing this function can lead to problems if firms do not approach it strategically

Published November 17, 2006

A large retail chain has implemented a data mart that is very effective in meeting its structured reporting requirements. The firm is investigating the design and implementation of a data warehouse solution that would support its ad-hoc query and analysis requirements more effectively and efficiently. It envisions that the data warehouse could reuse the interfaces already implemented to support its existing data mart.

One option the company is considering is to outsource this data warehouse to a provider that would give its key users an optimised solution for their ad-hoc information requests on a per-query fee basis. The client wants to know whether other enterprises have successfully implemented this approach, and which solution providers they used.

Data warehouse infrastructures are becoming the central, value-add platform for most companies. The information they hold, and the value generated by processing it through data mining, online analytical processing (OLAP) or reports, enable managers to make better, timelier decisions.

For many well-defined business processes, new analytic applications that access the warehouse can automate decision making by interacting directly with supply chain or demand chain applications. Because the warehouse is so strategic, it is rare to see a company outsource its enterprise data warehouse.

The term "outsourcing" by itself is vague at best. It could mean anything from using outside consultants to strengthen a project team to handing over the entire development, hosting and management of the warehouse to a third party.

Guidelines

A company must first define the problem. Many companies make the mistake of jumping directly into “solution mode” without taking a step back and asking what the problem is first. This is often overlooked because until various lines of business (marketing, finance, sales, for example) get together to discuss it, the solution seems clear to those working only for a single constituency.

Even individual groups will outgrow a solution designed only to answer those questions that are known. As companies become more analytically mature, their questions will change, the data latency requirements will change, and the data sources will change. Indeed, the only thing that can be expected is that change will occur continuously, especially if you have developed a successful data warehouse environment.

Secondly, a company must identify its resources, time and budget. You must be realistic about your current in-house capabilities. Do not be afraid to tell executive management that you will need additional resources, time, or money as it is better to do this early on than wait until much later in the project.

If your internal resources have little experience in developing an enterprise warehouse platform, pursue outsourcing options that bring in data warehousing expertise (architecture, sourcing and transformation, project management etc).

If you currently have under-utilised staff, get them involved in the project. Developing in-house expertise is the only real route to long-term data warehouse success. The initial development will set the foundation for the warehouse's success going forward.

One benefit of outsourcing certain responsibilities is that once the phase is done, you do not face the question of where to redeploy people who were hired specifically for this project.

You should expect that upfront costs will be higher due to the use of outsourced personnel versus in-house, but long-term, the savings should easily offset those costs through lower maintenance costs gained through the development of a more sustainable data warehouse platform.

A company should also decide how it wants to measure success. This is a crucial question and one that must be answered before heading down the road of providing a solution. "Agreement before access" is a phrase we have heard repeated many times by a number of warehousing experts.

The criteria should be stated in terms that are quantifiable. So, instead of stating that the warehouse must hold all your sales data, you should say something along the lines of “the warehouse must contain all detailed sales transactional data for the current fiscal year plus the previous fiscal year. It must contain summary data for the prior five years of sales. It should expect 20% growth in both data and concurrent users annually”.
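A criterion such as "20% growth annually" is easy to turn into a capacity projection that everyone can agree on up front. The sketch below is only an illustration: the 20% rate comes from the example criterion above, while the starting volume (10 TB), the starting user count (50) and the function name `project_growth` are assumptions made for the sake of the arithmetic.

```python
def project_growth(initial, annual_rate, years):
    """Return projected values for year 0 through `years` under compound growth."""
    return [initial * (1 + annual_rate) ** year for year in range(years + 1)]


# Hypothetical starting points; only the 20% rate is taken from the criterion.
data_tb = project_growth(initial=10.0, annual_rate=0.20, years=5)
users = project_growth(initial=50, annual_rate=0.20, years=5)

for year, (tb, u) in enumerate(zip(data_tb, users)):
    print(f"year {year}: {tb:6.1f} TB, {round(u):4d} concurrent users")
```

Because the growth compounds, five years at 20% roughly multiplies both figures by 2.5, which is the kind of number worth agreeing on before a platform or a provider is chosen.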

We have seen demand for specialty outsourcing of analytic processing. Typically this is done by third parties that specialise in specific niches, such as marketing services for telcos or healthcare. These services are not multi-subject in nature the way a true enterprise data warehouse would be. They tend to be more like data marts and serve a specific need.

Architecture

Today the trend is towards building what is termed an “enterprise” data warehouse. The overall design goal of an enterprise data warehouse is to create a definitive version of the organisation’s business data. This is no easy task when you consider the number and variety of systems and silos of company data that exist within any business organisation. This means rationalising data entities like “customer” into a single view.

To create a single version of the truth for an organisation, it logically follows that an enterprise data warehouse must consist of multiple subject areas (such as finance, marketing, and sales), representing areas of interest both for individual groups and for individuals who must view data across several subject areas. It is important to note that multiple subject areas are a design goal, not a requirement for usage. Typically, subject areas are added over time.

In the past, analytic environments have had very denormalised data models. This was done both to meet performance requirements and to simplify the writing of queries. The trend now is for warehouse environments to be designed for flexibility first and performance second. Flexibility refers specifically to what is termed “query freedom”, or the ability to ask any question. Since denormalised structures must anticipate the questions they can answer, the trend is towards keeping detailed (ie lowest level) transactional data in as close to third normal form as possible. This makes loading and maintenance easier, and normalised models tend to be more flexible because they represent the relationships between business entities rather than between data items (as denormalised models do).
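The difference can be illustrated with a small sketch using Python's built-in `sqlite3` module. The table names, columns and sample rows are all hypothetical. A denormalised summary (`sales_by_region`) answers the one question it was built for, while the normalised detail tables can also answer a question nobody anticipated, here sales by customer segment and product category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised detail: one table per business entity (hypothetical schema).
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, segment TEXT, region TEXT);
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        product_id  INTEGER REFERENCES product(product_id),
        amount      REAL
    );
    INSERT INTO customer VALUES (1, 'retail', 'north'), (2, 'wholesale', 'south');
    INSERT INTO product  VALUES (10, 'grocery'), (11, 'clothing');
    INSERT INTO sale VALUES (100, 1, 10, 20.0), (101, 1, 11, 35.0),
                            (102, 2, 10, 500.0), (103, 2, 10, 250.0);

    -- Denormalised summary: built in advance for one known question.
    CREATE TABLE sales_by_region AS
        SELECT c.region AS region, SUM(s.amount) AS total
        FROM sale s JOIN customer c USING (customer_id)
        GROUP BY c.region;
""")


def sales_by_segment_and_category(conn):
    """An unanticipated question, answerable only from the detail tables."""
    return conn.execute("""
        SELECT c.segment, p.category, SUM(s.amount)
        FROM sale s
        JOIN customer c USING (customer_id)
        JOIN product  p USING (product_id)
        GROUP BY c.segment, p.category
        ORDER BY c.segment, p.category
    """).fetchall()


print(conn.execute("SELECT * FROM sales_by_region ORDER BY region").fetchall())
print(sales_by_segment_and_category(conn))
```

The summary table has already discarded segment and category, so the second question cannot be answered from it without going back to the detail. That is the sense in which denormalised structures must anticipate their questions.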

Perhaps the most important aspect related to the question of outsourcing the data warehouse is the critical nature of the warehouse itself.

Companies that are well down the road to analytic maturity in their business processes look upon the warehouse as being as critical as any order entry system. Indeed, it has become ingrained in their business process. For this reason, complete outsourcing of an enterprise data warehouse would be a very difficult decision to make.

Exploration

As companies mature in their development and use of business analytics, there is one constant: information exploration always begins as an ad hoc query.

Over time the company matures analytically, and queries that were once ad hoc become institutionalised and automated, or pre-canned for consumption by users. So you should look on your ad hoc users as explorers: the relatively small group charged with leading the way towards discovery of critical business information.

When that group uncovers something useful, an easy path to that information is created for the other, larger group of users, whom you can think of as settlers. That path takes the form of either pre-run reports that are made accessible to business users or pre-canned queries that a business user can choose to run as they please.

In either case the query is well known and access plans and workload resource planning can be designed and tuned.
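One way to picture the hand-off from explorers to settlers is a small catalogue of named queries. The following is a minimal sketch rather than a description of any particular product: the `CannedQueries` class, its `promote` and `run` methods, and the `sale` table are all hypothetical, and Python's built-in `sqlite3` module stands in for the warehouse.

```python
import sqlite3


class CannedQueries:
    """Catalogue of named, parameterised queries promoted from ad hoc exploration."""

    def __init__(self, conn):
        self.conn = conn
        self._catalogue = {}

    def promote(self, name, sql):
        """Explorer step: register a query that proved useful under a stable name."""
        self._catalogue[name] = sql

    def run(self, name, **params):
        """Settler step: run a known query by name, supplying only its parameters."""
        return self.conn.execute(self._catalogue[name], params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (region TEXT, amount REAL);
    INSERT INTO sale VALUES ('north', 20.0), ('north', 35.0), ('south', 750.0);
""")

catalogue = CannedQueries(conn)

# An explorer found this query useful while working ad hoc, so it is promoted.
catalogue.promote(
    "total_for_region",
    "SELECT SUM(amount) FROM sale WHERE region = :region",
)

# A business user then runs it by name, without writing any SQL.
print(catalogue.run("total_for_region", region="north"))
```

Because each promoted query has a fixed text, it can be tuned once and its workload planned for, which is much harder to do for exploratory queries.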

Because of the relationship between the two groups (explorers and settlers), we do not recommend a separate environment for ad hoc users of the warehouse. The first issue is cost: maintaining two, potentially large, environments is expensive. The second issue is information integrity. If the ad hoc user discovers something interesting, will the same query produce the same results in another environment? It is for this reason that most companies are moving away from a data mart strategy towards a "single version of the truth" enterprise data warehouse.

In our opinion, the decision to outsource certain aspects of the design and implementation of an enterprise data warehouse project is an easy one. We highly recommend it, especially if your in-house staff lack experience in completing a successful enterprise data warehouse project.

As for the decision to outsource a warehouse for use by “ad hoc” users, we see only trouble resulting from it. If, on the other hand, your organisation is not considering a true “enterprise” data warehouse with all the attributes detailed above, and simply wants to provide a “reporting” instance for users to query as they wish, then that could certainly be outsourced. However, such environments are typically easy to build, and the premium paid to outsource one would likely not be justified.
