
Dimensions: The dimension tables are where the attributes of the dimensions of the business are stored.

The best attributes are textual and discrete, and they are used to constrain the fact table. Each of these textual descriptions helps describe a member of the respective dimension. Dimension tables are the entry points into the fact tables and determine the grain of the fact table. They serve as the primary source of query constraints, groupings, and report labels or row headers. They are relatively shallow in terms of rows but wide, with many large columns. They are usually not time dependent and often contain hierarchical relationships. Robust dimension attributes deliver analytic slicing and dicing capabilities. Dimension tables are de-normalized. Examples of dimensions: Employee, Time, Product, Customer, etc.

Dimension Keys: Dimensional modeling proposes that dimension keys should be surrogate keys. Surrogate keys are integers assigned sequentially as needed to populate a dimension. They are also known as meaningless keys, integer keys, artificial keys, or synthetic keys. Every join between dimension tables and fact tables in a data warehouse environment should be based on surrogate keys, not natural keys. The primary benefit of surrogate keys is that they buffer the data warehouse environment from operational changes. They also avoid the adverse performance impact of joining on composite natural keys. Avoid smart keys, natural keys, and production keys as dimension keys; smart keys are keys where you can tell something about the record just by looking at the key. With surrogate keys, the data warehouse team can maintain control over the environment without being affected by the operational rules for generating, updating, deleting, recycling, and reusing production keys. Examples: multiple sources using the same keys, production systems reusing the same values after a data purge, and systems with differently formatted keys being added at a later stage.
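The idea of sequentially assigned surrogate keys can be illustrated with a minimal Python sketch. The class name `SurrogateKeyGenerator`, the source-system labels, and the sample natural keys are all hypothetical and only serve the illustration; a real warehouse would typically rely on the database or the ETL tool for key assignment.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Assigns sequential integer keys to natural keys from source systems.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self._next_key = count(1)  # surrogate keys are plain integers, starting at 1
        self._lookup = {}          # (source_system, natural_key) -> surrogate key

    def key_for(self, source_system, natural_key):
        """Return the existing surrogate key, or assign the next integer."""
        business_key = (source_system, natural_key)
        if business_key not in self._lookup:
            self._lookup[business_key] = next(self._next_key)
        return self._lookup[business_key]


keys = SurrogateKeyGenerator()
crm_key = keys.key_for("crm", "C-100")          # first key assigned
billing_key = keys.key_for("billing", "C-100")  # same natural key, different source
crm_again = keys.key_for("crm", "C-100")        # lookup returns the earlier key
```

Because the lookup is keyed on the source system as well as the natural key, two sources that happen to use the same production key still receive distinct surrogate keys.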

Slowly Changing Dimensions (SCD): In the real world, dimensions and their descriptions, though relatively constant, evolve over time: employees come and go, they are promoted, salaries change, and so on. The term slowly changing dimensions refers to this variation in dimensional attributes over time. The word slowly might seem incorrect in this context, but in general, when compared to a measure in a fact table, changes to dimensional data occur slowly. We need a strategy to deal with these changed attributes over time. When we encounter a slowly changing dimension, we must make one of the following three fundamental choices. Each choice results in a different degree of tracking changes over time.

Type One (Overwriting history): A Type One change overwrites an existing dimensional attribute with new information. In a customer name-change example, the new name overwrites the old name, and the old value is lost. A Type One change updates only the attribute, doesn't insert new records, and affects no keys. It is easy to implement but does not maintain any history of prior attribute values.

Type Two (Preserving history): A Type Two change creates an additional dimension record at the time of the change with the new attribute values, thereby segmenting history very accurately between the old description and the new description. Implementing Type Two changes within a data warehouse might require significant analysis and development. Type Two changes partition history across time more accurately than the other types. However, because Type Two changes add records, they can significantly increase the database's size.

Type Three (Preserving a version of history): A Type Three change creates new "current" fields within the original dimension row to record the new attribute values while keeping the original attribute values as well. History can then be described both forward and backward from the change, either in terms of the original attribute values or in terms of the current attribute values. You usually implement Type Three changes only if you have a limited need to preserve and accurately describe history, such as when someone gets married and you need to retain the previous name.

Hybrid Type: As an alternative, you can mix Type One and Type Two changes at the attribute level by implementing Type Two changes only for attributes whose historical values are important when you're slicing and dicing. For example, users might not need to know an individual's previous name if a name change occurs and might want the system to show only the person's current name, so a Type One change would suffice. However, if the company reassigns sales territories, users might need to track who sold what, at what time, and in what territory, necessitating a Type Two change.
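The three change types and the hybrid approach can be sketched in Python against an in-memory list of dimension rows. This is only an illustration under stated assumptions: the column names (`surrogate_key`, `natural_key`, `valid_from`, `valid_to`, `is_current`, `previous_<attribute>`) and the sample data are hypothetical and not prescribed by the text.

```python
from datetime import date


def apply_type1(rows, natural_key, attribute, new_value):
    """Type One: overwrite the attribute in place; no new rows, no key changes."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row[attribute] = new_value


def apply_type2(rows, natural_key, attribute, new_value, change_date):
    """Type Two: expire the current row and add a row with a new surrogate key."""
    current = next(r for r in rows
                   if r["natural_key"] == natural_key and r["is_current"])
    current["is_current"] = False
    current["valid_to"] = change_date
    new_row = dict(current)  # copy the other attributes unchanged
    new_row.update({
        "surrogate_key": max(r["surrogate_key"] for r in rows) + 1,
        attribute: new_value,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })
    rows.append(new_row)


def apply_type3(rows, natural_key, attribute, new_value):
    """Type Three: keep the prior value in a 'previous_<attribute>' column."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row["previous_" + attribute] = row[attribute]
            row[attribute] = new_value


employees = [{
    "surrogate_key": 1, "natural_key": "E-7", "name": "Ann Lee",
    "territory": "North", "valid_from": date(2020, 1, 1),
    "valid_to": None, "is_current": True,
}]

# Hybrid: name changes are handled as Type One, territory changes as Type Two.
apply_type1(employees, "E-7", "name", "Ann Park")
apply_type2(employees, "E-7", "territory", "South", date(2023, 6, 1))

# Type Three on a separate, simpler dimension: the previous name is retained.
customers = [{"surrogate_key": 1, "natural_key": "C-1", "name": "Mary Jones"}]
apply_type3(customers, "C-1", "name", "Mary Smith")
```

After the hybrid changes, `employees` holds two rows for the same natural key: the expired row still shows the North territory, and the new row carries a new surrogate key and the South territory, so facts recorded before and after the reassignment join to the right description.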

Rapid Changing Dimensions (RCD): In a rapidly changing dimension, the dimension attribute values change rapidly over time. Note that there is no yardstick for telling whether a dimension is slowly changing or not; this is based on the judgment of the data modeler. An SCD may also become an RCD over time, or vice versa. For RCDs, the design followed depends on the size of the dimension. Small dimensions: the same techniques as for slowly changing dimensions may be applied. Large dimensions: the best approach for efficiently browsing and tracking changes of key attributes in really huge dimensions is to break off one or more mini dimensions from the dimension table, each consisting of small clumps of attributes that have been managed to have a limited number of values.

Degenerate Dimensions: A degenerate dimension is represented by a dimension key attribute with no corresponding dimension table. Degenerate dimensions usually occur in line item-oriented fact table designs. Many dimensional designs revolve around some kind of control document such as an order, an invoice, a bill of lading, or a ticket. Usually these control documents are a kind of container with one or more line items inside. A very natural grain for a fact table in these cases is the individual line item; in other words, a fact table record is a line item. Once dimensions such as Product, Customer, and Time have been chosen, the attributes associated with the order number move into those dimensions. At the end of the design, the order number is sitting by itself, without any attributes. We call this a degenerate dimension. The degenerate dimension key should be the actual production order number and should sit in the fact table without a join to anything. There is no point in making a dimension table because it would not contain anything.

Junk Dimensions: A junk dimension is a convenient grouping of typically low-cardinality flags and indicators. By creating an abstract dimension, we remove the flags from the fact table while placing them into a useful dimensional framework. Sometimes, after all the dimensions have been carved out, some flags or text attributes are left over in the fact table but do not belong to any of the dimension tables. When a number of miscellaneous flags and text attributes exist, the following design alternatives should be avoided: leaving the flags and attributes unchanged in the fact table record; making each flag and attribute into its own separate dimension; and stripping all of these flags and attributes out of the design.

A better alternative is to create a junk dimension.

Conformed Dimensions: Conformed dimensions can be used to analyze facts from two or more data marts. For example, shipping and sales data marts both require a customer dimension and a time dimension. If they're the same dimensions, then you have conformed dimensions, allowing you to extract and manipulate facts relating to a particular customer from both marts and to answer questions such as whether late shipments have affected sales to that customer. If you add a marketing data mart to analyze product promotions, with conformed customer and time dimensions, you're able to analyze the effects of a particular product promotion on sales. (Analyzing facts from more than one fact table in this way is termed drilling across.)

The same conformed dimensions (in this case, the time and customer dimensions) have meaning in the context of three independently developed data marts. These dimensions become enterprise property and can be used later in other marts as the enterprise data warehouse evolves. Conformed dimensions have consistent definitions regardless of where they are used. This allows a single query to be run across multiple fact tables, data marts, and data warehouses.
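Drilling across can be sketched in Python as two separate summaries, one per fact table, merged on the key of the shared customer dimension. The table contents, the tuple layouts, and the function name `drill_across` are hypothetical and chosen only to mirror the shipping and sales example.

```python
from collections import defaultdict

# Conformed customer dimension shared by both marts (surrogate key -> attributes).
customer_dim = {1: {"name": "Acme"}, 2: {"name": "Globex"}}

# Sales mart fact rows: (customer_key, sales_amount)
sales_facts = [(1, 100.0), (1, 50.0), (2, 75.0)]
# Shipping mart fact rows: (customer_key, shipped_late)
shipping_facts = [(1, True), (1, False), (2, False)]


def drill_across(sales, shipments, customers):
    """Summarize each fact table by customer, then merge on the conformed key."""
    total_sales = defaultdict(float)
    for customer_key, amount in sales:
        total_sales[customer_key] += amount

    late_shipments = defaultdict(int)
    for customer_key, late in shipments:
        late_shipments[customer_key] += int(late)

    return {
        customers[key]["name"]: {
            "sales": total_sales[key],
            "late_shipments": late_shipments[key],
        }
        for key in customers
    }


report = drill_across(sales_facts, shipping_facts, customer_dim)
```

The merge works only because both fact tables use the same customer keys with the same meaning; with two unrelated customer dimensions, the per-customer rows could not be lined up.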
Facts:

The fact table is at the center of a star schema and holds the primary measurement data. It contains the actual numerical measurements that the business is interested in. Fact tables express the many-to-many relationships between dimensions. A fact table typically has two types of columns: those that contain measures and those that are foreign keys to dimension tables. Some key features of a fact table are: Multi-part key: a composite key with one foreign key for each dimension; time is always a part of the key. Usually numeric: keys are surrogate integers and the measures are numeric. Typically additive.

Granularity refers to the level of detail of the data in the fact table. Data at the lowest level of granularity is referred to as atomic data. The granularity is determined by the grain: the grain is the meaning of a single record in a fact table. The granularity also determines how far you can drill down without returning to the base transaction system data. The lower the grain, the more records will be present in the fact table. We must make sure that the grain is low enough to support our decision support needs.
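The effect of grain on drill-down can be illustrated with a small Python sketch. The line-item rows, the column names, and the helper `roll_up` are hypothetical; the point is only that atomic rows can be aggregated to any coarser grain, while a table stored at the coarser grain cannot be broken back down.

```python
from collections import defaultdict

# Atomic grain: one row per order line item.
line_items = [
    {"date": "2024-01-01", "order": "A1", "product": "pen", "quantity": 2},
    {"date": "2024-01-01", "order": "A1", "product": "ink", "quantity": 1},
    {"date": "2024-01-01", "order": "A2", "product": "pen", "quantity": 5},
    {"date": "2024-01-02", "order": "A3", "product": "pen", "quantity": 3},
]


def roll_up(rows, grain):
    """Aggregate quantity to a coarser grain given as a tuple of column names."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[col] for col in grain)] += row["quantity"]
    return dict(totals)


daily_product = roll_up(line_items, ("date", "product"))  # one row per day and product
daily = roll_up(line_items, ("date",))                    # one row per day
```

Four atomic rows collapse to three rows at the day-and-product grain and two rows at the day grain. If only `daily` had been stored, questions about individual products or orders could no longer be answered without going back to the source system.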

Fact Types:

Additive Facts: Additive facts are the measurements in a fact table that can be added across all dimensions, e.g., discrete numerical measures of activity such as quantity sold or sales dollars.

Semi-Additive Facts: Numeric facts that can be added across some dimensions in a fact table but not across others, e.g., inventory levels and balances cannot be added along the time dimension but can be usefully averaged over the time dimension.

Non-Additive Facts: Facts that cannot logically be added between rows. They may be numeric, in which case they usually must be combined in a computation with other facts before being added across rows. If non-numeric, they can only be used in constraints, counts, or groupings. E.g., a measurement of room temperature.

Factless Fact Table: A fact table that has no facts but captures certain many-to-many relationships between the dimension keys. It is most often used to represent events or to provide coverage information that does not appear in other fact tables, e.g.:
1. Tracking student attendance at a college.
2. A promotion coverage fact table to answer questions like "Which products were on promotion that didn't sell?", which the sales fact table does not capture.
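The fact types above can be contrasted in a short Python sketch. All rows, column names, and values are made up for illustration; the promotion coverage set stands in for a factless fact table that holds only dimension keys.

```python
# Additive fact: quantity sold can be summed across every dimension.
sales = [
    {"day": 1, "product": "pen", "quantity": 4},
    {"day": 2, "product": "pen", "quantity": 6},
    {"day": 1, "product": "ink", "quantity": 1},
]
total_quantity = sum(row["quantity"] for row in sales)

# Semi-additive fact: inventory levels sum across products but not across days.
inventory = [
    {"day": 1, "product": "pen", "on_hand": 10},
    {"day": 2, "product": "pen", "on_hand": 20},
    {"day": 1, "product": "ink", "on_hand": 5},
    {"day": 2, "product": "ink", "on_hand": 5},
]
# Summing across products within one day is meaningful.
on_hand_day1 = sum(r["on_hand"] for r in inventory if r["day"] == 1)
# Across days, average the level instead of summing it.
pen_levels = [r["on_hand"] for r in inventory if r["product"] == "pen"]
average_pen_on_hand = sum(pen_levels) / len(pen_levels)

# Factless fact table: promotion coverage rows contain only (day, product) keys.
promotion_coverage = {(1, "pen"), (1, "ink"), (2, "ink")}
sold = {(row["day"], row["product"]) for row in sales}
promoted_but_unsold = promotion_coverage - sold
```

The last line answers the coverage question from the text: the sales fact table records only what sold, so the products that were promoted but did not sell are found by comparing it against the factless coverage table.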
