Analytics

The term ‘Analytics’ is becoming more and more commonly used…

Analytics are also the end game of the BI System Builders' Cornerstone Solutions®. But what are Analytics? This article describes the term as used by the BI System Builders. While some consultants in other organisations may have a different definition, this article will serve to make the BI System Builders' thinking clear.

Firstly, it's important to note that the term Analytics is frequently used interchangeably with other names such as Analytical Applications, Business Analytics, and Performance Management Applications, to name but a few. BI System Builders use only one term, that of 'Analytics'.

Now, all organisations capture data. This may be through sales invoices, in a call centre, via a website, on delivery notes from suppliers, in research questionnaires, financial transactions, point of sale transactions, and so on. The data captured is in a raw format and will usually be disjointed across several of your data capture and transaction systems. At this stage, reporting against the data is often referred to as operational reporting. The data in operational reports is not summarised but low level, detailed and granular. That's fine, but have you ever tried to read through a million rows of data in a day? That granularity has its uses for operational purposes, where the data will be heavily filtered, but data for BI purposes needs to be at a higher level, consolidated or summarised in some way to make it readable.
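
As a rough illustration of that summarisation step, the sketch below rolls granular transaction rows up to one readable row per store per day. The table and column names (sales_transaction, store_id, txn_date, net_amount) are purely illustrative and not taken from any particular system.

  -- Roll detailed point of sale rows up to a daily, per-store summary
  SELECT store_id,
         txn_date,
         COUNT(*)        AS transaction_count,
         SUM(net_amount) AS total_sales
  FROM   sales_transaction
  GROUP BY store_id, txn_date
  ORDER BY store_id, txn_date;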

The volume of transaction system data captured has been growing year on year. It may be structured or unstructured and can also be highly volatile. Furthermore, the exponential growth of the internet, mobile devices and social media means that enormous amounts of detailed information can be captured every day – your web clicks, your buying behaviour, your communications, your location and GPS co-ordinates, your posts, your reviews, and so on.

Historically, Business Intelligence tools have been used to extract, clean and format the raw data and then join it together, turning it into valuable and useful business information. That's great; it's relevant and highly valuable in a structured data warehouse. The data is often summarised and may also be pre-calculated in other ways. The information can then be used to discover insights into your business. The reports are sometimes shared across the organisational enterprise, hence the term enterprise reporting.

Let's take a look at the classic Analytic capability in the data warehouse and then consider Analytics in the new Big Data era. Analytics go a step further than simple reporting by adding extra business value into the reports. This is done in several ways. Some of the classic ways are the use of dynamic parameters for on-the-fly analysis such as period on period, 'drill down, drill up' through hierarchy structures, a drill across capability linking business process areas across the business, and predictive engines or 'What If' scenarios.

Dynamic parameters allow the user to refresh the data in the Analytic and change the question being asked by selecting contextual values in a prompt. For period on period analysis you may start by viewing sales today compared with sales yesterday. The business user can then easily change the Analytic to compare current month to date sales with last month to date sales, or with last year's month to date sales, and so on. The Analytic below includes several calculated fields and all of the analysis periods can be changed. When they are changed, the other fields automatically recalculate. The Analytic also has a drill down capability and dynamic sentences which automatically capture values such as the name of the city with the highest percentage change.

Business Analytics
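
To make the period on period idea concrete, here is a minimal SQL sketch of the same comparison. The monthly_sales table is hypothetical and the two named parameters stand in for the values the business user selects in the prompt; the BI tool would generate something equivalent behind the Analytic.

  -- :current_period and :previous_period are the prompt values chosen by the user
  SELECT city,
         current_mtd,
         previous_mtd,
         (current_mtd - previous_mtd) * 100.0 / NULLIF(previous_mtd, 0) AS pct_change
  FROM (
      SELECT city,
             SUM(CASE WHEN sales_month = :current_period  THEN sales_amount ELSE 0 END) AS current_mtd,
             SUM(CASE WHEN sales_month = :previous_period THEN sales_amount ELSE 0 END) AS previous_mtd
      FROM   monthly_sales
      WHERE  sales_month IN (:current_period, :previous_period)
      GROUP BY city
  ) p;

Changing the prompt values simply re-runs the comparison for a different pair of periods, which is exactly what the dynamic parameters in the Analytic above allow the business user to do.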

Drill down and drill up is used to navigate through a hierarchy in your data. For example you may capture sales data by location. In an Analytic you may view sales for all locations and then with a click of the mouse ‘drill down’ to sales by region, and then sales by city and down to individual sales outlets. This can all be achieved within a single Analytic.
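
A hedged sketch of that drill path in SQL is shown below, using GROUP BY ROLLUP so that a single result set contains the all-locations total plus the region, city and outlet levels. The sales_by_location table and its columns are illustrative, and ROLLUP syntax support varies slightly between databases.

  -- One result set covering every level of the location hierarchy
  SELECT region,
         city,
         outlet,
         SUM(sales_amount) AS total_sales
  FROM   sales_by_location
  GROUP BY ROLLUP (region, city, outlet)
  ORDER BY region, city, outlet;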

The drill across capability allows you to navigate through a series of linked Analytics. These are often based on a logical business process flow.  The linkage is achieved through code written behind the scenes and invisible to the business user. Analytics are clever enough to know the context of the information you are viewing such as location, time period and product and pass this context to the next Analytic.
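
Under the covers, drill across is usually implemented by aggregating each business process area separately and then merging the results on the shared, conformed dimensions. The sketch below assumes two hypothetical fact tables (fact_orders and fact_shipments) and shows the viewing context, here a product line and a calendar month, being carried into both queries as parameters.

  SELECT o.product_line,
         o.calendar_month,
         o.order_quantity,
         s.shipped_quantity
  FROM  (SELECT product_line, calendar_month, SUM(quantity) AS order_quantity
         FROM   fact_orders
         WHERE  product_line = :product_line AND calendar_month = :month
         GROUP BY product_line, calendar_month) o
  JOIN  (SELECT product_line, calendar_month, SUM(quantity) AS shipped_quantity
         FROM   fact_shipments
         WHERE  product_line = :product_line AND calendar_month = :month
         GROUP BY product_line, calendar_month) s
    ON  s.product_line   = o.product_line
    AND s.calendar_month = o.calendar_month;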

Predictive and What If capabilities allow the user to play out different scenarios. For example you could measure the impact of inflation. To do this a percentage value would be input into a dynamic parameter prompt.  An algorithm in the Analytic would then read the value and recalculate itself to show you the impact.
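
A minimal sketch of such a What If calculation is given below. The fact_costs table and its columns are illustrative, and :inflation_pct stands in for the percentage the user types into the dynamic parameter prompt.

  SELECT cost_centre,
         SUM(cost_amount)                                AS current_cost,
         SUM(cost_amount) * (1 + :inflation_pct / 100.0) AS projected_cost
  FROM   fact_costs
  GROUP BY cost_centre;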

There are many other things that you can achieve with Analytics, such as cycle time analysis. The picture below is a cycle time chart created by BI System Builders using SAP BusinessObjects XI 4.0 Interactive Analysis (Web Intelligence). In this case the chart analyses the actual cycle time of customer orders, but the technique can be applied to any business process. The information has great value: if you can measure the duration of the individual stages within a business process, you can identify those that are least efficient. By addressing any inefficiency within a stage you can reduce its duration and consequently its cost, thus improving the profitability of the process.

It is also common to measure things such as KPIs, customer churn, customer lifetime value and stock turn, to carry out time series analysis, and to develop strategy maps, balanced scorecards, and Six Sigma based statistical process charts across all of your business process areas, not just one process area in isolation. Analytics based on the BI system can combine HR data with Finance data, Finance data with Supply Chain data, and so on. The bottom line is that Analytics will help you find efficiency and effectiveness in your business according to your needs. This type of Analytic is currently used by many organisations, although many still struggle to realise them. They are also demonstrated through Google Analytics, LinkedIn Analytics, and YouTube Analytics.

Business Analytic Cycle Time
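
The measurement behind a cycle time chart like the one above can be sketched in SQL. Assuming a hypothetical fact table in which each stage of the order process is timestamped, the duration of every stage, and of the process as a whole, can be calculated and the least efficient stage identified. Timestamp arithmetic syntax differs between databases, so this is indicative only.

  SELECT order_id,
         picked_ts     - order_placed_ts AS time_to_pick,
         despatched_ts - picked_ts       AS time_to_despatch,
         delivered_ts  - despatched_ts   AS time_to_deliver,
         delivered_ts  - order_placed_ts AS total_cycle_time
  FROM   fact_customer_order;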

However, we no longer live in the stable and slowly changing world of the data warehouse alone. The cost of technology is falling, 64-bit processors are here, and large amounts of RAM can now be exploited. Furthermore, technological advancements have yielded in-memory applications such as SAP HANA and distributed architectures and file systems such as Apache Hadoop. The combination of these things means that enormous volumes of data can be 'released' from the systems in which they were captured and processed fast – near real-time, and even real-time, fast. Things that were once beyond budget are now starting to come within budget.

Earlier I asked whether you have ever tried to read through a million rows of data. I wasn't joking; I've witnessed business users attempting to do this in a report in order to find the bits of information that they needed. Of course, they give up quite quickly and don't get repeat opportunities because of the system degradation such report processing can cause. Now I'll ask another question: have you ever tried to read through a hundred billion rows of data in a report? Preposterous? Yes. Impossible to process? No – some organisations now capture data at the terabyte-plus scale daily.

Hence the rise and rise of Analytics. It is impossible to make meaning out of such high volume, highly volatile and disparate data as is now becoming available without Analytics. And with such rich data available it will be exploited, and exploited in a pre-emptive way. Consequently the term 'Data Scientist' has been popularised and is closely associated with Analytics. As sophisticated as the new Data Scientists may be with algorithms and complex modelling, the concepts of decision engines, learning models, predictive engines, data mining, BAM, pattern identification, trending, time series analysis, correlation, regression, tests of significance, disparate source mapping, and identifying homogeneous groups are not new. However, now that the technology and the data are available, these things will be exploited in Analytics like never before. For a clear example of this consider 'Big Data means Marketing Science', but that topic deserves an article all of its own…

The End to End BI Concept

Considering the Full Landscape

Cornerstone Solutions® use an End to End BI approach, and here's why. When thinking about Business Intelligence architecture we need to consider the component parts in their entirety, as a whole, and not simply individually. In other words, the full BI system is considered as a whole and the individual components are not treated in isolation from each other. This is because the components need to act interdependently, regardless of whether they have been procured from a single software vendor or from several.

In this respect the BI system is similar to the human body: everything is so closely related that a felt symptom in one area (a pain in the arm) may be caused by an unseen problem in another part of the body (an internal organ). To relieve the pain felt in the arm we treat the unseen cause in the internal organ. The same concept of unseen relationships and causal effects applies to the Business Intelligence system. Data flows from end to end through the Business Intelligence system in a similar way to blood circulating in the body, and it must not be blocked, lost or corrupted at any stage. The BI Architect must ensure this.

End to End BI
The components in the BI system are akin to a linked interdependent chain

Interdependency of the BI system

The early phases of a BI implementation can usefully be considered as akin to the Rational Unified Process (RUP) stages of Strategy, Inception, Elaboration, and Construction. A well-defined BI strategy is very important. However, perhaps paradoxically, the Construction phase is often delivered using an Agile delivery method.

Construction frequently commences with source system analysis, end user requirement gathering, and the installation and configuration of the software components. If SAP Business Information Warehouse is used, cubes and queries will be developed; if a relational platform is used, a dimensional modelling exercise is undertaken and then the physical tables are developed. The ETL system is designed and developed, and reports and dashboards are built.
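
By way of illustration, the kind of physical tables such a dimensional modelling exercise might produce on a relational platform is sketched below: a small star schema with a fact table keyed on its dimensions plus the invoice number. All names and data types are illustrative rather than prescriptive.

  CREATE TABLE dim_date (
      date_key       INTEGER  PRIMARY KEY,
      calendar_date  DATE     NOT NULL,
      calendar_month CHAR(7)  NOT NULL,
      calendar_year  INTEGER  NOT NULL
  );

  CREATE TABLE dim_customer (
      customer_key  INTEGER      PRIMARY KEY,
      customer_name VARCHAR(100) NOT NULL,
      city          VARCHAR(60)
  );

  CREATE TABLE fact_sales (
      date_key       INTEGER       NOT NULL REFERENCES dim_date (date_key),
      customer_key   INTEGER       NOT NULL REFERENCES dim_customer (customer_key),
      invoice_number VARCHAR(20)   NOT NULL,
      sales_amount   DECIMAL(12,2) NOT NULL,
      quantity       INTEGER       NOT NULL,
      PRIMARY KEY (date_key, customer_key, invoice_number)
  );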

The relationship between these things is one of interdependency; it is like a linked, interdependent chain. This is why, in the implementation methodology of BI System Builders, we practise our philosophy of End to End BI. End to End BI takes the view that because each component in the BI system interacts with, and therefore depends on, its related components, BI Breakpoints can occur. BI System Builders take full consideration of the interdependencies in the landscape to ensure prevention rather than cure in their End to End BI project delivery method.

Avoid BI Breakpoints With Cornerstone Solutions

In their End to End BI philosophy, BI System Builders refer to the concept of BI Breakpoints. The application of our Cornerstone Solutions® will help you avoid BI Breakpoints, but what are they?

BI Breakpoints

Here's an example of a small breakpoint in a BI system with relational tables that illustrates the escalating effort and associated cost. It starts with a poorly defined report specification. In this example no one identifies the specification as being problematic, so the report is developed according to the specification. However, when the report reaches User Acceptance Testing (UAT) it fails. After consultation with the end user who carried out the UAT it is understood that two important report columns are missing, so the report specification is changed. However, the new data columns are not supported by the dimensional model, so this also must be changed, requiring modelling and DBA work. This in turn means that the universe and the ETL packages must be changed, and so on and so forth. You can see here the interdependence and knock-on effect between the components, leading to escalating effort and cost.

However, if you know what to look for you can pre-empt where the BI breakpoints might occur and put quality gates in place. Here are some examples of where BI breakpoints might occur in the dimensional model design of a relational database:

  • The use of a data warehouse generated artificial primary key rather than a composite primary key on a fact table (see the sketch after this list)
  • Snowflaking in a relational model (not SAP OLAP)
  • A table whose nature is not made explicit through a clear naming convention
  • Table joins leading to chasms or fan traps
  • Fact to fact joins (there are exceptions)
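
To make the first bullet concrete, here is a hedged sketch of the contrast, using purely illustrative names. With an artificial single-column key, duplicate rows at the business grain can slip in unnoticed; a composite primary key made up of the dimension keys and the degenerate invoice number enforces the declared grain.

  -- Potential breakpoint: warehouse-generated artificial key on the fact table
  CREATE TABLE fact_sales_weak (
      fact_sales_id  INTEGER PRIMARY KEY,
      date_key       INTEGER       NOT NULL,
      customer_key   INTEGER       NOT NULL,
      invoice_number VARCHAR(20)   NOT NULL,
      sales_amount   DECIMAL(12,2) NOT NULL
  );

  -- Preferred: the composite primary key enforces the grain of the fact table
  CREATE TABLE fact_sales_strong (
      date_key       INTEGER       NOT NULL,
      customer_key   INTEGER       NOT NULL,
      invoice_number VARCHAR(20)   NOT NULL,
      sales_amount   DECIMAL(12,2) NOT NULL,
      PRIMARY KEY (date_key, customer_key, invoice_number)
  );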

Now a dimensional model design can be tricky to fully evaluate while it remains purely on paper or a computer screen, but it is possible at this point to put a quality gate in place to minimise the risk of future BI breakpoints. Firstly, the design can be checked against best practice principles to avoid the types of things listed in the bullet points above. Another quality gate can occur after the physical model has been generated: execute business questions against the tables using Structured Query Language (SQL) and validate that they can be answered. To choose the business questions to be executed via SQL, refer to the report specifications or consult the business users.

It is better to write pure SQL statements using a SQL tool than to use SAP BusinessObjects to generate the business questions in queries; it is a false economy to wait for Web Intelligence reports to be designed before testing the physical model. The reason for this is that the design should be validated as soon as possible, and before the universe is generated. To achieve this some test data must be loaded into the tables, but it is far more efficient to identify a design weakness at this stage than later on, when changes to the physical model will have a knock-on effect on the universe design and any dependent reports.
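
As a hedged example, a business question such as "what were the monthly sales by city for a given year?" taken from a report specification could be validated directly against the physical tables like this, long before any universe or Web Intelligence report exists. The star schema names are hypothetical.

  SELECT c.city,
         d.calendar_month,
         SUM(f.sales_amount) AS total_sales
  FROM   fact_sales   f
  JOIN   dim_date     d ON d.date_key     = f.date_key
  JOIN   dim_customer c ON c.customer_key = f.customer_key
  WHERE  d.calendar_year = :report_year
  GROUP BY c.city, d.calendar_month;

If the tables cannot answer a question like this with straightforward SQL, that is a strong signal that the dimensional model needs attention before the universe is built.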

If it becomes apparent in the quality gate that overly complex SQL has to be generated to answer a business question, then the design should be revisited and optimised. The business questions should be focussed around any fact to fact joins, snowflaking or bridging tables, as these are potential BI Breakpoints. Their existence can lead to very poor performance, miscalculations due to chasm and fan traps, and the prevention of drill across.
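
The sketch below, again with hypothetical names, shows why fact to fact joins attract special attention in the quality gate: joining two fact tables directly fans the rows out, so the summed measure is silently inflated by the number of matching rows on the other side, which is the classic fan trap miscalculation.

  -- Over-counts ordered_quantity whenever a product line has several shipment rows
  SELECT o.product_line,
         SUM(o.quantity) AS ordered_quantity
  FROM   fact_orders    o
  JOIN   fact_shipments s ON s.product_line = o.product_line
  GROUP BY o.product_line;

Aggregating each fact table separately and then merging the results, as in a proper drill across, avoids the trap.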

To minimise the risk of BI breakpoints occurring the BI Architect should introduce best practice principles along with quality gates early on. Here’s an example of the best practice principle of designing aggregates. The use of aggregates upfront can address three breakpoint areas before they occur:

  • Slow query response times
  • Report rendering times
  • Overly complex reports

It is of course vital to select the correct aggregate type to support the end user requirement. There are several different types of aggregates, for example invisible aggregates (roll ups), new facts (ETL pre-calculated fields), pre-joined aggregates, and accumulating snapshots. Aggregates allow the processing effort to be pushed away from the report and into the database engine and ETL program, where it can be more efficiently executed. However, the use of specific aggregates should not be decided upon until the detailed report specifications are available. Choosing the most appropriate aggregates to support the reports requires skill and experience and is not a menial task.
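
As a simple sketch of the invisible aggregate (roll up) type, the hypothetical monthly table below is populated by the ETL program from the detailed fact table, so that month-level reports read a fraction of the rows. Names follow the illustrative star schema used earlier.

  CREATE TABLE agg_sales_month (
      calendar_month CHAR(7)       NOT NULL,
      customer_key   INTEGER       NOT NULL,
      sales_amount   DECIMAL(14,2) NOT NULL,
      quantity       INTEGER       NOT NULL,
      PRIMARY KEY (calendar_month, customer_key)
  );

  INSERT INTO agg_sales_month (calendar_month, customer_key, sales_amount, quantity)
  SELECT d.calendar_month,
         f.customer_key,
         SUM(f.sales_amount),
         SUM(f.quantity)
  FROM   fact_sales f
  JOIN   dim_date   d ON d.date_key = f.date_key
  GROUP BY d.calendar_month, f.customer_key;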

Aggregate tables can significantly reduce the size of the row set returned to the BI reporting tool. As a rule of thumb, it is always better for BI reporting tools to work with smaller row sets, and it is always best to force as much processing effort as possible back to the 'gorilla' of the BI system – the server itself. It can be even better when processing effort and calculations are forced back into the ETL program. However, care should be taken to educate end users when the effort is pushed back into the ETL program for calculated facts, so-called 'New Facts', because those facts may be semi-additive or not additive at all.

The downside of using aggregate tables is the increased maintenance, e.g. administering ten tables instead of five, and the additional ETL effort required to support them. However, their benefits outweigh their cost.

BI breakpoints can manifest in far more areas than just relational tables; they may be seemingly invisible, and they are costly when they occur because of the interdependencies of the BI system. However, applying governance to the BI system around breakpoint areas is not always the most popular notion at implementation, because it requires the effort to think cleverly and to apply hard work, rigour, and discipline upfront to pre-empt problems that are almost invisible. We may be tempted to take short cuts, but it is better to apply the extra early effort. Entering into small skirmishes and battles in the early stages to iron out BI breakpoints is much better than allowing them to go unchecked and develop into big firefights later. Regardless of whether your BI system relies on SAP OLAP cubes and queries or on relational tables, BI breakpoints are a real threat.