Microsoft Office Tutorials and References
In Depth Information
The core organizing principle of the data mart structure is that the database is composed of
two distinct types of entity:
■ Dimension A dimension is an analytical object. A dimension can be the list of products
or customers, the time space, or any other entity used to analyze numbers.
■ Dimensions have attributes . An attribute of a product may be its color, its
manufacturer, or its weight. An attribute of a date may be simply its weekday or its
■ Dimensions have both natural keys and surrogate keys . The natural key is the
original product code, customer ID, or real date. The surrogate key is an
independent integer number used in the data marts to uniquely identify a dimension entity,
joining it to related facts.
■ A dimension has relationships with facts. Its reason for being is that of adding
qualitative information to the numeric measures of an event contained in facts.
Sometimes a dimension might reference other dimensions, or it might correlate
to other dimensions even if its main purpose is that of joining to facts.
■ Fact A fact is something that happened or that has been measured. A fact may be the
sale of a single product to a single customer or the total amount of sales of a specific
item during a month. From our point of view, a fact is an event (usually represented by
a number) that we want to aggregate in several forms to generate our reports.
■ We normally relate a fact to several dimensions, but we do not relate facts in any
way with other facts.
■ Facts definitely have relationships with dimensions via the surrogate keys. This is
one of the foundations of Kimball’s methodology.
In all the examples you have seen up to now, Product is a dimension table, whereas
SalesOrderDetail, for instance, is a fact table. Some tables, such as SalesOrderHeader,
behave both as fact and dimension tables, thus violating the Kimball rule that says that
a table is either a fact or a dimension, but not both at the same time. This is not an issue
because—up to now—we have never queried a data warehouse, and it is normal for OLTP
databases to violate the rules of data warehouses because they are not data warehouses.
Having seen what facts and dimensions are, you are now ready to learn about a very well-known
structure called a Star Schema. A Star Schema is simply the shape taken by data warehouse
diagrams. When you define dimensions and create relationships between facts and
dimensions, a star schema is the natural result. The fact table is always at the center of the schema,
with the dimensions related to the fact table placed around it. Because the fact table is directly
related only to the dimensions, you get a shape resembling a star, as you can see in Figure
4-28, where we show a piece of the schema of the AdventureWorks data warehouse.