Posted on

The Aggregate Advantage

This article is about the advantage of using aggregate tables in your data warehouse when using Web Intelligence as your reporting tool. Whilst it is possible to configure a SAP BusinessObjects universe for use with Web Intelligence on low level detail database tables such as transaction tables, it is rarely a good idea to do so.

The primary purpose of Web Intelligence is to allow users to perform ad hoc analysis, speedily slicing and dicing data and drilling up and down through it. The speed of this on the fly analysis is possible due to the architecture of Web Intelligence. When a query is executed Web Intelligence creates a small cube of cached data referred to as a microcube. Note the word ‘microcube’. It is not a macrocube, it is not an OLAP cube and Web Intelligence is not an OLAP tool, so we must not try to use it as one. In our experience we have found it best not to overload the microcube with enormous amounts of cached data.

You should go to the transaction tables with Web Intelligence only when you really need to get to that level of detail. If your query returns 10000 rows of data from a transaction table are you really going to scroll through all those rows for analysis purposes? If your answer really is yes, you might want to consider using Crystal Reports instead because it has a different underlying architecture more appropriate to that type of reporting.

If your answer is no, then let’s consider a reporting example. In this example your reporting requirement is to view your sales revenue figures by location, by product category, by year and by month. Now you could run your query against transaction tables and then use Web Intelligence to sum up your revenue. This approach has three weaknesses. The first is that the actual SQL code generated by your query has to execute against very big database tables and so may be slow. The second problem is that you may be returning thousands of rows to your microcube. The third issue is that you are then asking the Web Intelligence program to sum up or aggregate these thousands of rows of data and in doing so you are failing to exploit the power of your database server and instead stressing the Web Intelligence report engine.

There is a better way; your report is a candidate for aggregate tables as it is using summary data. So what are aggregate tables? Simplistically they are tables that hold data that has been summed up or aggregated in some way. Aggregated tables are not necessarily a simple summing up; they can also consist of other aggregates such as pre-calculated ratios, but the concept is a simple one. You take a transaction table, maybe holding 100 million rows of data, then using the Data Integrator tool you extract the data, aggregate it, and then load it into another table holding the summary data. In our example the table would be sales revenue by location, by product category, by year, by month. Now let’s say that because of the aggregation the summary table only holds 1 million rows of consolidated data, what effect does this have?

Firstly, the SQL query is faster because there is much less data in the table to query. Secondly, you are not asking Web Intelligence to do the summing up; that already happened behind the scenes in the data load. Thirdly, because the data is already summed up, significantly less data needs to be cached in the microcube, and consequently the Web Intelligence performance will be excellent.

This article is only a simplistic introduction to aggregation and clearly there is much more to consider than this. However, it has served its purpose, which is to point you in the right direction.

Posted on

Mapping Drilling Hierarchies

Business Intelligence Mapping Drilling Hierarchies

There are various types of hierarchy that range from simplistic to complex hierarchies with multiple level starting points. These complex hierarchies are often referred to as ragged hierarchies. Ragged hierarchies can occur because some of your source data systems may capture data at a very granular level whilst another system may capture or output data at summary levels. You may also be using third party data and wish to combine it with your in house data. When the data is consolidated it may not be possible to populate all levels of the hierarchy with the same coverage if at all.
In these situations it can be useful to undertake a mapping exercise with an outcome similar to that shown in the diagram. This should occur before modelling the hierarchies in dimension tables and the ETL build.