
BW Multicube in BW

Basic concepts

25 August 2017

INTRODUCTION

EXAMPLE SCENARIO

DEFINITION OF THE INFOCUBE-UNION / OF THE MULTICUBE

Characteristics
Key figures
Building the Multicube

MYTHS OF THE JOIN INFOCUBE

QUERY PROCESSING WITH MULTICUBES

Query Pruning
Technical breakdown of a query in the Data Manager

Introduction
Up until now, the concept of BW has been to view the InfoCube as the only reporting unit. This means that
(a) queries only have access to the data in one InfoCube and, conversely, (b) users must store all of the
data for current and future queries in a single InfoCube. This leads to the following problems:
There are reporting scenarios that actually consist of several subordinate scenarios (see the example below).
Most requests concentrate on just one of these sub-scenarios. However, a few requests need answers
from two or more sub-scenarios. To satisfy such requests, all sub-scenarios must be put together in one
common InfoCube. This results in unnatural and/or large models on the one hand and unnecessary
processing costs on the other: requests that only concern one sub-scenario must also process a great
deal of irrelevant data from the other sub-scenarios.
Point (b) poses a challenge for data modeling, namely that future requirements must also be borne in
mind. As a result, InfoCubes become unnecessarily overloaded in many cases.
Comment: The concept described in this paper has until now been known as the Join-Infocube. However,
as we will see below, it bears no similarity to a database join; it is far closer to the traditional union
operator. Therefore, when talking about a Multicube we are dealing with a Union-Infocube.

/conversion/tmp/scratch/364151295.doc Seite 1
Example Scenario
A typical example of a scenario that contains several sub-scenarios can be found in the area of "Sales and
Distribution" (SD; possibly now known as "Sales and Marketing"). Here you could, for example, follow an
order through its various stages:
Order (ORDER)
Delivery (DELIVERY)
Billing (BILLING).
Each of these stages corresponds to its own sub-scenario, which can be described with its own key figures
and with characteristics that are partly shared with other sub-scenarios and partly its own. To provide a
comprehensive example for the discussion below, we assume the following characteristics (C) and key
figures (K):
Order (ORDER)
ONUM: Order number (C)
ODAT: Order date (C)
OPER: Salesman (C)
CUS: Customer (C)
PROD: Product (C)
OQUA: Order quantity (K)
OPRI: Order price (K)

Delivery (DELIVERY)
ONUM: Order number (C)
DDAT: Delivery date (C)
DPER: Supplier (C)
CUS: Customer (C)
PROD: Product (C)
DQUA: Delivery quantity (K)
DPRI: Delivery price (K)
COST: (Partial) costs for the delivery (K)

Billing (BILLING)
ONUM: Order number (C)
BDAT: Billing date (C)
BPER: Invoicer (C)
CUS: Customer (C)
PROD: Product (C)
BQUA: Billing quantity (K)
BPRI: Billing price (K)
COST: (Partial) costs for the invoicing (K)

If the InfoObjects above were all stored in a single InfoCube SD, then constellations such as the one
shown in diagram 1 would result. Since each sub-scenario generally has its own independent data source,
the data for each sub-scenario is loaded at different times. Therefore, large areas with dummy entries
(* or 0) exist.
Comment: Both here and in the rest of this paper, InfoCube data is displayed in a flat structure. The way
InfoCube data is displayed is completely independent of the Multicube concept. In BW, InfoCube data is
stored in a relational star schema. In principle, any other multi-dimensional store could be used as well.
Here, for the sake of simplicity, a "flat display" has been used.

ONUM  CUS  PROD  ODAT  OPER  DDAT  DPER  BDAT  BPER  OQUA  OPRI  DQUA  DPRI  BQUA  BPRI  COST
1     C1   P1    1998  X1    *     *     *     *     5     100   0     0     0     0     0
2     C2   P1    1998  X2    *     *     *     *     10    200   0     0     0     0     0
3     C1   P2    1997  X3    *     *     *     *     4     130   0     0     0     0     0
4     C2   P2    1997  X2    *     *     *     *     8     150   0     0     0     0     0
4     C2   P2    1998  X2    *     *     *     *     -2    -40   0     0     0     0     0
1     C1   P1    *     *     1998  X2    *     *     0     0     5     100   0     0     20
2     C2   P1    *     *     1999  X1    *     *     0     0     7     120   0     0     20
2     C2   P1    *     *     1999  X2    *     *     0     0     3     80    0     0     20
3     C1   P2    *     *     1998  X1    *     *     0     0     2     60    0     0     20
4     C2   P2    *     *     1998  X2    *     *     0     0     6     110   0     0     25
1     C1   P1    *     *     *     *     1999  X1    0     0     0     0     5     100   10
2     C2   P1    *     *     *     *     1999  X1    0     0     0     0     10    200   10
3     C1   P2    *     *     *     *     1998  X2    0     0     0     0     4     130   10

Diagram 1: Three sub-scenarios in an InfoCube SD.

Alternatively, one could also imagine three (initially) independent InfoCubes (see diagrams 2, 3 and 4).

ONUM  CUS  PROD  ODAT  OPER  OQUA  OPRI
1     C1   P1    1998  X1    5     100
2     C2   P1    1998  X2    10    200
3     C1   P2    1997  X3    4     130
4     C2   P2    1997  X2    8     150
4     C2   P2    1998  X2    -2    -40

Diagram 2: The sub-scenario Order in an InfoCube ORDER.

ONUM  CUS  PROD  DDAT  DPER  DQUA  DPRI  COST
1     C1   P1    1998  X2    5     100   20
2     C2   P1    1999  X1    7     120   20
2     C2   P1    1999  X2    3     80    20
3     C1   P2    1998  X1    2     60    20
4     C2   P2    1998  X2    6     110   25

Diagram 3: The sub-scenario Delivery in an InfoCube DELIVERY.

ONUM  CUS  PROD  BDAT  BPER  BQUA  BPRI  COST
1     C1   P1    1999  X1    5     100   10
2     C2   P1    1999  X1    10    200   10
3     C1   P2    1998  X2    4     130   10

Diagram 4: The sub-scenario Billing in an InfoCube BILLING.

This example highlights the problems:

Queries that only affect one sub-scenario (for example, the total delivery quantity for a certain customer)
can be answered equally well from InfoCube SD and from the InfoCube of the respective sub-scenario.
However, performance will differ: InfoCube SD is certainly larger, so more data needs to be processed.
Queries that span more than one sub-scenario (for example, outstanding deliveries for a customer) can,
at present, only be answered using InfoCube SD.

Definition of the Infocube-Union / of the Multicube
Let us assume that A, B and C are InfoCubes, where an InfoCube is for now viewed as a flat structure as
in diagrams 1-4. C is the union of A and B and is referred to as the Multicube:

$C = A \cup B$

$C_A$, $C_B$, $C_C$ and $K_A$, $K_B$, $K_C$ are the sets of characteristics [1] and key figures that appear in A, B and C.
The following holds for $C_A$, $C_B$, $C_C$ and for $K_A$, $K_B$, $K_C$ respectively:

$C_C \subseteq C_A \cup C_B$

$K_C \subseteq K_A \cup K_B$
The following comments on characteristics apply equally to navigation attributes. For the latter, however,
an attribute that has not been switched on as a navigation attribute in cube A or cube B can nevertheless
be a navigation attribute in Multicube C.

Characteristics

(M1) Trivial derivation rules: Characteristics with the same technical name can exist in both A and B. In
practice this means that they appear in the same column of Multicube C, if they are used in it. More
formally, the following rules result for a characteristic c (where c is the technical name):

$c \in C_A \rightarrow c \in C_C$
$c \in C_B \rightarrow c \in C_C$

In other words: characteristics with the same technical name are semantically the same object. In
the example above these are ONUM, CUS and PROD.
If you want to keep such characteristics distinct, you can use reference characteristics. Assume in the
example above that a (basic) characteristic PER stands for a person. To avoid mixing up the orderer, the
deliverer and the invoicer (i.e. putting them into the same column), the referencing characteristics OPER,
DPER and BPER have been created on PER. In this way they are kept apart, which means that they are
also logically managed in different columns in the Multicube.
(M2) Explicit derivation rules: Conversely, characteristics or navigation attributes that reference the same
basic characteristic can be brought together again in a Multicube. Therefore, one can also imagine the
following rules:

$c \in C_A \rightarrow c \in C_C$
$n \in C_B \rightarrow c \in C_C$

with $c \neq n$, where c and n reference the same basic characteristic, or c references n or vice versa. You
can set this kind of derivation rule in the definition of the Multicube: menu "Extras" > "MultiCube" >
"Identify characteristics". Here n can also be a navigation attribute. If, in the example above, OPER, DPER
and BPER reference the same basic characteristic PER, then their columns can be brought together in a
Multicube.
[1] Including potential navigation attributes: these are attributes that are not necessarily activated as
navigation attributes for the respective InfoCube, but that can in principle be used as navigation attributes.

(M3) Implicit derivation rules: Assume that the following trivial and explicit derivation rules exist, where c
and n are real characteristics and c_a is a navigation attribute of characteristic c:

$c \in C_A \rightarrow c \in C_C$
$n \in C_B \rightarrow n \in C_C$
$c_a \in C_A \rightarrow n \in C_C$

A further derivation rule can be derived from these three if $c \notin C_B$ (and, therefore, $c_a \notin C_B$):

$n \in C_B \rightarrow c_a \in C_C$

These implicit derivation rules are determined by the system and then applied. The following mini scenario
shows why they are necessary:

InfoCube A
c  c_a  k1
X  Y    100

InfoCube B
n  k2
Y  200

The implicit derivation rule produces the following logical data structure; the decisive cell is the value Y
in column c_a of the second record:

Multicube $C = A \cup B$
c  c_a  n  k1   k2
X  Y    Y  100  0
*  Y    Y  0    200

Without the implicit derivation rule, the following two records would exist logically:

Multicube $C = A \cup B$
c  c_a  n  k1   k2
X  Y    Y  100  0
*  *    Y  0    200

Comment: Implicit derivation rules are only taken into account as of BW 2.0B.
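
The mini scenario can be retraced with a minimal Python sketch. It assumes that an InfoCube is simply a list of flat records and that derivation rules are dictionaries mapping a column of C to its source column in the basic cube; the names project, rules_a, rules_b_with and rules_b_without are hypothetical and only serve the illustration.

```python
NULL = "*"  # placeholder for a characteristic that a basic cube cannot supply

cube_a = [{"c": "X", "c_a": "Y", "k1": 100}]
cube_b = [{"n": "Y", "k2": 200}]

CHARS_C = ["c", "c_a", "n"]
KEYFIGS_C = ["k1", "k2"]

# Rules as dictionaries: column in C -> source column in the basic cube.
rules_a = {"c": "c", "c_a": "c_a", "n": "c_a"}  # c_a in A also feeds n (explicit rule)
rules_b_without = {"n": "n"}                    # trivial rule only
rules_b_with = {"n": "n", "c_a": "n"}           # plus the implicit rule: n in B feeds c_a


def project(cube, rules):
    """Map every record of a basic cube into the column layout of C."""
    rows = []
    for rec in cube:
        row = {ch: rec[rules[ch]] if ch in rules else NULL for ch in CHARS_C}
        row.update({k: rec.get(k, 0) for k in KEYFIGS_C})
        rows.append(row)
    return rows


with_rule = project(cube_a, rules_a) + project(cube_b, rules_b_with)
without_rule = project(cube_a, rules_a) + project(cube_b, rules_b_without)

print(with_rule[1])     # record from B: c_a is filled with Y
print(without_rule[1])  # record from B: c_a stays NULL
```

With the implicit rule, a restriction on c_a also finds the record that comes from InfoCube B; without it, that record carries a NULL value in c_a.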

Key figures

If $K_A \cap K_B = \emptyset$ does not hold, then you have to decide whether key figures that are in both A and B are
redundant, which would mean that values for such a key figure should only be taken from one cube, or
whether they complement each other, which would mean that values for such a key figure should be taken
from both cubes. As an example, consider the key figure COST in the cubes DELIVERY and
BILLING. If total costs are stored in this key figure in both cubes, then it would definitely be wrong to sum
the values across both cubes in a Multicube. The values for COST should, therefore, come either from
DELIVERY alone or from BILLING alone. However, if partial costs are stored in COST (as in the scenario
above), then it would probably make sense to use the values from both cubes and to logically merge them
in the same column in the Multicube.
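
The difference between the two interpretations can be shown with a short Python sketch using the COST values from diagrams 3 and 4. The function cost_per_order and the reduction of the records to (ONUM, COST) pairs are assumptions made for this illustration only.

```python
# (ONUM, COST) pairs taken from the DELIVERY and BILLING example data.
delivery = [(1, 20), (2, 20), (2, 20), (3, 20), (4, 25)]
billing = [(1, 10), (2, 10), (3, 10)]


def cost_per_order(sources):
    """Sum COST per order number over the cubes selected as sources."""
    totals = {}
    for cube in sources:
        for onum, cost in cube:
            totals[onum] = totals.get(onum, 0) + cost
    return totals


# Complementary key figure: partial costs from both cubes are merged.
complementary = cost_per_order([delivery, billing])
# Redundant key figure: values are taken from DELIVERY alone.
redundant = cost_per_order([delivery])

print(complementary)  # {1: 30, 2: 50, 3: 30, 4: 25}
print(redundant)      # {1: 20, 2: 40, 3: 20, 4: 25}
```

If COST held total costs in both cubes, the first result would overstate the costs, which is why the source cube has to be chosen per key figure.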

Building the Multicube

The union of two InfoCubes A and B into a Multicube C is formally defined in the following way:
1. You build a flat structure C from $C_C$ and $K_C$.
2. The content of the flat structure A is filled into C, whereby for every column of C that does not occur in
A a kind of NULL value is entered (for example, * for characteristics and 0 for key figures; see diagram
1). The derivation rules described in (M1) to (M3) must be observed here.
3. As in 2, but this time with the flat structure B.
The result is, in effect, a reconstruction of the layout shown in diagram 1. As with the traditional union
operator $\cup$, the following holds:

$|C| \leq |A| + |B|$

which also speaks against the previous name Join-Infocube.
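
The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the BW implementation: InfoCubes are lists of flat records, only same-name (trivial) derivation rules are applied, and the records are simply concatenated without removing duplicates, so that here |C| = |A| + |B|. The helper fill is hypothetical.

```python
NULL_CHAR, NULL_KEYFIG = "*", 0

ORDER_COLS = ["ONUM", "CUS", "PROD", "ODAT", "OPER", "OQUA", "OPRI"]
DELIVERY_COLS = ["ONUM", "CUS", "PROD", "DDAT", "DPER", "DQUA", "DPRI", "COST"]

order = [dict(zip(ORDER_COLS, r)) for r in [
    (1, "C1", "P1", 1998, "X1", 5, 100),
    (2, "C2", "P1", 1998, "X2", 10, 200),
    (3, "C1", "P2", 1997, "X3", 4, 130),
    (4, "C2", "P2", 1997, "X2", 8, 150),
    (4, "C2", "P2", 1998, "X2", -2, -40),
]]
delivery = [dict(zip(DELIVERY_COLS, r)) for r in [
    (1, "C1", "P1", 1998, "X2", 5, 100, 20),
    (2, "C2", "P1", 1999, "X1", 7, 120, 20),
    (2, "C2", "P1", 1999, "X2", 3, 80, 20),
    (3, "C1", "P2", 1998, "X1", 2, 60, 20),
    (4, "C2", "P2", 1998, "X2", 6, 110, 25),
]]

# Step 1: flat structure of C (characteristics, then key figures).
CHARS_C = ["ONUM", "CUS", "PROD", "ODAT", "OPER", "DDAT", "DPER"]
KEYFIGS_C = ["OQUA", "OPRI", "DQUA", "DPRI", "COST"]


def fill(cube):
    """Steps 2 and 3: copy a cube into C, padding columns it does not have."""
    return [
        {**{c: rec.get(c, NULL_CHAR) for c in CHARS_C},
         **{k: rec.get(k, NULL_KEYFIG) for k in KEYFIGS_C}}
        for rec in cube
    ]


multicube = fill(order) + fill(delivery)
print(len(multicube))  # 10 records: 5 from ORDER, 5 from DELIVERY
```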

Diagram 5 shows an example for the Multicube $C = \mathrm{ORDER} \cup \mathrm{DELIVERY}$. The display shows the logical
view of the Multicube C. All queries on C logically run over this dataset.

ONUM  CUS  PROD  ODAT  OPER  DDAT  DPER  OQUA  OPRI  DQUA  DPRI  COST
1     C1   P1    1998  X1    *     *     5     100   0     0     0
2     C2   P1    1998  X2    *     *     10    200   0     0     0
3     C1   P2    1997  X3    *     *     4     130   0     0     0
4     C2   P2    1997  X2    *     *     8     150   0     0     0
4     C2   P2    1998  X2    *     *     -2    -40   0     0     0
1     C1   P1    *     *     1998  X2    0     0     5     100   20
2     C2   P1    *     *     1999  X1    0     0     7     120   20
2     C2   P1    *     *     1999  X2    0     0     3     80    20
3     C1   P2    *     *     1998  X1    0     0     2     60    20
4     C2   P2    *     *     1998  X2    0     0     6     110   25

Diagram 5: Multicube $C = \mathrm{ORDER} \cup \mathrm{DELIVERY}$ (see diagrams 2 and 3), using only the trivial derivation rules (M1).

Diagram 6 shows an example for the Multicube $C = \mathrm{ORDER} \cup \mathrm{DELIVERY}$ in which DPER from DELIVERY is
brought together with OPER by an explicit derivation rule.

ONUM  CUS  PROD  ODAT  OPER  DDAT  OQUA  OPRI  DQUA  DPRI  COST
1     C1   P1    1998  X1    *     5     100   0     0     0
2     C2   P1    1998  X2    *     10    200   0     0     0
3     C1   P2    1997  X3    *     4     130   0     0     0
4     C2   P2    1997  X2    *     8     150   0     0     0
4     C2   P2    1998  X2    *     -2    -40   0     0     0
1     C1   P1    *     X2    1998  0     0     5     100   20
2     C2   P1    *     X1    1999  0     0     7     120   20
2     C2   P1    *     X2    1999  0     0     3     80    20
3     C1   P2    *     X1    1998  0     0     2     60    20
4     C2   P2    *     X2    1998  0     0     6     110   25

Diagram 6: Multicube $C = \mathrm{ORDER} \cup \mathrm{DELIVERY}$ (see diagrams 2 and 3), with the explicit derivation rule
$\mathrm{DPER} \in \mathrm{DELIVERY} \rightarrow \mathrm{OPER} \in C$.
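
The effect of the explicit derivation rule can be sketched in Python by renaming DPER to OPER while a DELIVERY record is copied into the Multicube. The function to_multicube and the dictionary DELIVERY_RULES are hypothetical names; the sketch assumes flat records and one rule dictionary per basic cube.

```python
NULL_CHAR, NULL_KEYFIG = "*", 0
CHARS_C = ["ONUM", "CUS", "PROD", "ODAT", "OPER", "DDAT"]
KEYFIGS_C = ["OQUA", "OPRI", "DQUA", "DPRI", "COST"]

order_rec = {"ONUM": 1, "CUS": "C1", "PROD": "P1",
             "ODAT": 1998, "OPER": "X1", "OQUA": 5, "OPRI": 100}
delivery_rec = {"ONUM": 1, "CUS": "C1", "PROD": "P1",
                "DDAT": 1998, "DPER": "X2", "DQUA": 5, "DPRI": 100, "COST": 20}

# Explicit rule for DELIVERY: column in the basic cube -> column in C.
DELIVERY_RULES = {"DPER": "OPER"}


def to_multicube(rec, rules=None):
    """Copy one record into the layout of C, applying explicit rules first."""
    rules = rules or {}
    renamed = {rules.get(col, col): val for col, val in rec.items()}
    row = {c: renamed.get(c, NULL_CHAR) for c in CHARS_C}
    row.update({k: renamed.get(k, NULL_KEYFIG) for k in KEYFIGS_C})
    return row


print(to_multicube(order_rec))
print(to_multicube(delivery_rec, DELIVERY_RULES))  # OPER is X2, ODAT is *
```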

Myths of the Join Infocube


Many users imagine a Multicube to be a kind of join of several InfoCubes. In general this is not possible,
as the following example illustrates. It shows the result of a relational natural join [2] of the data
displayed in diagrams 2 and 3 (ORDER and DELIVERY):

ONUM  CUS  PROD  ODAT  OPER  DDAT  DPER  OQUA  OPRI  DQUA  DPRI  COST
1     C1   P1    1998  X1    1998  X2    5     100   5     100   20
2     C2   P1    1998  X2    1999  X1    10    200   7     120   20
2     C2   P1    1998  X2    1999  X2    10    200   3     80    20
3     C1   P2    1997  X3    1998  X1    4     130   2     60    20
4     C2   P2    1997  X2    1998  X2    8     150   6     110   25
4     C2   P2    1998  X2    1998  X2    -2    -40   6     110   25

Diagram 7: Natural join between ORDER and DELIVERY.

The problematic cells are the repeated key figure values: OQUA and OPRI of the order record for order 2
and DQUA, DPRI and COST of the delivery record for order 4 each appear twice. Key figure values are
thus duplicated, and many query results become incorrect and unusable.
As you can see, such a situation arises because of the actual data and not because of the definition /
metadata of the Multicube or the participating basic cubes. You can therefore construct perfectly sensible
examples of InfoCube joins. In general, however, joins are not possible, as the example in diagram 7
shows.
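
The duplication can be verified with a short Python sketch that performs the natural join on the example data and compares key figure totals. The tuple layout and variable names are assumptions made for this illustration.

```python
# (ONUM, CUS, PROD, ODAT, OPER, OQUA, OPRI)
order = [
    (1, "C1", "P1", 1998, "X1", 5, 100),
    (2, "C2", "P1", 1998, "X2", 10, 200),
    (3, "C1", "P2", 1997, "X3", 4, 130),
    (4, "C2", "P2", 1997, "X2", 8, 150),
    (4, "C2", "P2", 1998, "X2", -2, -40),
]
# (ONUM, CUS, PROD, DDAT, DPER, DQUA, DPRI, COST)
delivery = [
    (1, "C1", "P1", 1998, "X2", 5, 100, 20),
    (2, "C2", "P1", 1999, "X1", 7, 120, 20),
    (2, "C2", "P1", 1999, "X2", 3, 80, 20),
    (3, "C1", "P2", 1998, "X1", 2, 60, 20),
    (4, "C2", "P2", 1998, "X2", 6, 110, 25),
]

# Natural join on the commonly named columns ONUM, CUS and PROD.
joined = [o + d[3:] for o in order for d in delivery if o[:3] == d[:3]]

true_oqua = sum(o[5] for o in order)      # total order quantity in ORDER
joined_oqua = sum(j[5] for j in joined)   # same key figure after the join
true_dqua = sum(d[5] for d in delivery)   # total delivery quantity in DELIVERY
joined_dqua = sum(j[9] for j in joined)   # same key figure after the join

print(true_oqua, joined_oqua)  # 25 35
print(true_dqua, joined_dqua)  # 23 29
```

A union of the two cubes, by contrast, keeps every record exactly once, so both totals stay at 25 and 23.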

[2] The natural join is a database join whose join condition is the equality of the columns that have the
same name in both tables. In this example these are ONUM, CUS and PROD.

Query processing with Multicubes
Conceptually, the query result is calculated on the basis of the flat data structure of the Multicube. In
reality, however, a query on a Multicube is split into individual queries on the participating basic
InfoCubes. These can then be processed in parallel or in sequence. Diagram 8 shows this split.

Diagram 8: Splitting of a Multicube Query into subqueries on the basic participating Infocubes.
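
The split can be sketched in Python as follows. The sketch assumes one sub-query per basic InfoCube, each aggregating one key figure per customer, and uses a thread pool merely to illustrate parallel processing; subquery and multicube_query are hypothetical names and say nothing about how the Data Manager actually schedules the sub-queries.

```python
from concurrent.futures import ThreadPoolExecutor

# (CUS, OQUA) from ORDER and (CUS, DQUA) from DELIVERY.
order = [("C1", 5), ("C2", 10), ("C1", 4), ("C2", 8), ("C2", -2)]
delivery = [("C1", 5), ("C2", 7), ("C2", 3), ("C1", 2), ("C2", 6)]


def subquery(cube, keyfig):
    """Aggregate one key figure per customer on a single basic cube."""
    result = {}
    for cus, value in cube:
        result[cus] = result.get(cus, 0) + value
    return keyfig, result


def multicube_query():
    """Run one sub-query per basic cube and merge the partial results."""
    jobs = [(order, "OQUA"), (delivery, "DQUA")]
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda job: subquery(*job), jobs))
    merged = {}
    for keyfig, per_customer in parts:
        for cus, value in per_customer.items():
            merged.setdefault(cus, {"OQUA": 0, "DQUA": 0})[keyfig] = value
    return merged


print(multicube_query())
# {'C1': {'OQUA': 9, 'DQUA': 7}, 'C2': {'OQUA': 16, 'DQUA': 16}}
```

The merged result answers a question that spans two sub-scenarios, for example the outstanding deliveries per customer (OQUA minus DQUA).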

Query Pruning

In certain cases you can decide in advance that a sub-query cannot return any data. These cases are
defined more precisely later on. The following principle is important here:

NULL values (displayed as * in the diagrams) never fulfill an (inclusive) condition on the
corresponding column.

Consider, for example, Multicube $C = \mathrm{ORDER} \cup \mathrm{DELIVERY}$ from diagram 5. If a query containing the
restriction OPER = 'X2' is defined on it, then you can decide a priori that the subquery on
InfoCube DELIVERY cannot return any data, because characteristic OPER does not appear in DELIVERY
and, as a result, the relevant records from DELIVERY contain NULL values in that column.

A query with the same restriction on Multicube $C = \mathrm{ORDER} \cup \mathrm{DELIVERY}$ from diagram 6, by contrast,
can certainly return records from DELIVERY because of the explicit derivation rule
$\mathrm{DPER} \in \mathrm{DELIVERY} \rightarrow \mathrm{OPER} \in C$. This becomes obvious in diagram 6.
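
The pruning decision for these two cases can be sketched in Python. The function can_prune is a hypothetical helper; it assumes that all restrictions are inclusive and apply to the query as a whole, and that explicit rules are given as a dictionary from a column of C to its source column in the basic cube.

```python
def can_prune(restricted_chars, cube_chars, rules):
    """Return True if the sub-query on a basic cube cannot return data.

    restricted_chars: characteristics of C that carry an inclusive restriction
    cube_chars:       characteristics available in the basic cube
    rules:            explicit rules, column in C -> column in the basic cube
    """
    for c in restricted_chars:
        source = rules.get(c, c)
        if source not in cube_chars:
            return True  # the column is NULL for every record of this cube
    return False


DELIVERY_CHARS = {"ONUM", "CUS", "PROD", "DDAT", "DPER"}

# Diagram 5: no rule for OPER, so the sub-query on DELIVERY is pruned.
print(can_prune({"OPER"}, DELIVERY_CHARS, {}))                # True
# Diagram 6: OPER is fed by DPER, so DELIVERY must be queried.
print(can_prune({"OPER"}, DELIVERY_CHARS, {"OPER": "DPER"}))  # False
```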

Technical breakdown of a query in the Data Manager

The technical query structures considered here are the set CHAR of characteristics involved in the query,
the set KEYF of key figures and the set RESTR of restrictions. All of these sets are defined on the basis of
the Multicube and are transformed in such a way that they define a suitable query on one of the basic
cubes. The algorithms that derive the corresponding structures for the subquery on InfoCube A from the
structures of a Multicube C are described in the following sections.
Characteristics / navigation attributes of a query: $\mathrm{CHAR}_C \rightarrow \mathrm{CHAR}_A$

Assumption: an entry for the characteristic / navigation attribute c is in $\mathrm{CHAR}_C$ and should be transferred
to $\mathrm{CHAR}_A$:
1. c is available in InfoCube A: transfer the entry unchanged into $\mathrm{CHAR}_A$ (trivial derivation rule).
2. c is not available in InfoCube A, but there is an explicit derivation rule $n \in C_A \rightarrow c \in C_C$:
transfer the entry for c into $\mathrm{CHAR}_A$ and replace c with n in doing so.
3. c is not available in InfoCube A, but there is an implicit derivation rule $n \in C_A \rightarrow c \in C_C$:
transfer the entry for c into $\mathrm{CHAR}_A$ and replace c with n in doing so.
4. Otherwise: this column must be filled with neutral (initial) values.
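
The four cases can be sketched in Python as follows. The representation is an assumption made for this illustration: each entry of the derived set is a pair (column in C, source column in A), and INITIAL marks a column that must be filled with initial values. The function translate_chars is hypothetical.

```python
INITIAL = None  # marks a column that is filled with neutral (initial) values


def translate_chars(char_c, cube_chars, explicit, implicit):
    """Derive the characteristics of the sub-query on A from those of C.

    explicit / implicit map a characteristic c of C to its source n in A.
    Returns a list of pairs (column in C, source column in A or INITIAL).
    """
    char_a = []
    for c in char_c:
        if c in cube_chars:
            char_a.append((c, c))            # case 1: trivial rule
        elif c in explicit:
            char_a.append((c, explicit[c]))  # case 2: explicit rule
        elif c in implicit:
            char_a.append((c, implicit[c]))  # case 3: implicit rule
        else:
            char_a.append((c, INITIAL))      # case 4: initial values
    return char_a


DELIVERY_CHARS = {"ONUM", "CUS", "PROD", "DDAT", "DPER"}
result = translate_chars(["CUS", "OPER", "ODAT"], DELIVERY_CHARS,
                         explicit={"OPER": "DPER"}, implicit={})
print(result)  # [('CUS', 'CUS'), ('OPER', 'DPER'), ('ODAT', None)]
```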

Key figures of a query: $\mathrm{KEYF}_C \rightarrow \mathrm{KEYF}_A$

Assumption: an entry for key figure k is in $\mathrm{KEYF}_C$ and should be transferred to $\mathrm{KEYF}_A$:
1. k is in InfoCube A and values from A should be transferred to Multicube C: transfer the entry
unchanged into $\mathrm{KEYF}_A$.
2. Otherwise: this column must be filled with neutral (initial) values.
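
A minimal Python sketch of this rule, assuming that the Multicube definition records for each basic cube which key figures are to be taken from it. The function translate_keyfigs and the choice to treat COST as redundant (taken from BILLING only) are assumptions for the illustration.

```python
def translate_keyfigs(keyf_c, cube_keyfigs, taken_from_cube):
    """Derive the key figures of the sub-query on A from those of C.

    Returns a dict: key figure -> True if it is read from the basic cube,
    False if the column is filled with the initial value instead.
    """
    return {k: (k in cube_keyfigs and k in taken_from_cube) for k in keyf_c}


DELIVERY_KEYFIGS = {"DQUA", "DPRI", "COST"}
# Assumption: COST is treated as redundant and taken from BILLING only.
TAKEN_FROM_DELIVERY = {"DQUA", "DPRI"}

keyf_delivery = translate_keyfigs(["OQUA", "DQUA", "COST"],
                                  DELIVERY_KEYFIGS, TAKEN_FROM_DELIVERY)
print(keyf_delivery)  # {'OQUA': False, 'DQUA': True, 'COST': False}
```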

Restrictions of a query: $\mathrm{RESTR}_C \rightarrow \mathrm{RESTR}_A$

Assumption: an entry for the characteristic / navigation attribute c is in $\mathrm{RESTR}_C$ and should be transferred
to $\mathrm{RESTR}_A$:
1. c is available in InfoCube A: transfer the entry unchanged into $\mathrm{RESTR}_A$ (trivial derivation rule).
2. c is not available in InfoCube A, but there is an explicit derivation rule $n \in C_A \rightarrow c \in C_C$:
transfer the entry for c into $\mathrm{RESTR}_A$ and replace c with n in doing so.
3. c is not available in InfoCube A, but there is an implicit derivation rule $n \in C_A \rightarrow c \in C_C$:
transfer the entry for c into $\mathrm{RESTR}_A$ and replace c with n in doing so.
4. Otherwise: the entry is not transferred into $\mathrm{RESTR}_A$. At this point you can also decide whether the
sub-query on A makes any sense at all (see the section on "Query Pruning"):
   a. If the condition applies to the query as a whole and does not consist solely of EXCLUDING
   ranges, then the query on A makes no sense, because the NULL values that are logically present
   in column c for all records of A can never fulfill this condition.
   b. If the condition belongs to a query column, then the query on A cannot contribute any values,
   at least for this column. Therefore, all conditions in $\mathrm{RESTR}_C$ for the same query column are
   irrelevant for $\mathrm{RESTR}_A$; they should no longer be transferred or, if already transferred, are to be
   removed from $\mathrm{RESTR}_A$.
   c. If, because of 4b, all conditions for query columns gradually disappear, so that $\mathrm{RESTR}_A$
   ultimately contains no more conditions for query columns while $\mathrm{RESTR}_C$ does contain such
   entries, then InfoCube A does not return values for any of the requested query columns. In this
   case the query on A does not make any sense and can be ignored.
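
The rules for restrictions, including the pruning cases 4a to 4c, can be sketched in Python. The representation is an assumption made for this illustration: each restriction is a dictionary with the keys char, column (None for a condition on the query as a whole) and excluding (True if it consists only of EXCLUDING ranges), and explicit and implicit rules are combined in one dictionary. The function translate_restrictions is hypothetical and returns None when the sub-query on A can be skipped.

```python
def translate_restrictions(restr_c, cube_chars, rules):
    """Derive the restrictions for the sub-query on A; None means 'skip A'.

    rules maps a characteristic c of C to its source n in A.
    """
    dead_columns = set()  # query columns for which A cannot contribute
    kept = []
    for r in restr_c:
        c = r["char"]
        source = c if c in cube_chars else rules.get(c)
        if source is not None:
            kept.append({**r, "char": source})  # cases 1 to 3
        elif r["column"] is not None:
            dead_columns.add(r["column"])       # case 4b
        elif not r["excluding"]:
            return None                         # case 4a
        # otherwise: an EXCLUDING-only global condition is just not transferred

    # Case 4b: drop conditions that belong to a dead query column.
    restr_a = [r for r in kept if r["column"] not in dead_columns]

    # Case 4c: every query column restricted in C is dead for A.
    columns_c = {r["column"] for r in restr_c if r["column"] is not None}
    if columns_c and columns_c <= dead_columns:
        return None
    return restr_a


DELIVERY_CHARS = {"ONUM", "CUS", "PROD", "DDAT", "DPER"}
oper = {"char": "OPER", "column": None, "excluding": False}

print(translate_restrictions([oper], DELIVERY_CHARS, {}))                # None
print(translate_restrictions([oper], DELIVERY_CHARS, {"OPER": "DPER"}))
```

The first call reproduces the pruning example for diagram 5, the second the situation of diagram 6, where the restriction on OPER is rewritten into a restriction on DPER.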
