Microsoft Office Tutorials and References
In Depth Information
Users are not normally aware of the existence of a technical data model because they interact
with the database using an interface—that is, a piece of software that can understand both
data models (the user’s and the technician’s ones), read data from the database in the technical
format, and display it to the user in the friendly format. Office workers taking orders therefore
do not really need to understand that orders are actually stored in at least these three technical
tables that compose the database structure of orders. Moreover, the more complex the structure
is, the higher the number of technical tables that you need to model it.
On the other hand, as you might have noticed already, PowerPivot lets you browse the
database as it is stored, without trying to make it look simpler. If the data is kept in three tables,
PowerPivot shows you columns from all three tables. If a table contains technical columns,
the pivot table does not hide them, even if we know that the user makes no use at all of
For this reason, you need to create a data model in PowerPivot that is nearer to the logical
than to the physical type. In this chapter, we show various examples in which you can see
prominent differences between the physical and the logical model. Whenever we find such
issues, we analyze standard methods to remove them to present the logical model to the user.
If you still remember the content of Chapter 2 (if not, you can always go back a few pages
to recall it), you now recognize that when you learned to use the RELATED function to hide
technical columns and technical tables, you learned the first steps for turning a physical
model into a logical one.
Normalization and Denormalization
Now we analyze one of the most common causes of the differences between the physical
and logical view of a database: data normalization . Normalization, in the database world,
means the removal of redundancy in the table structure.
Let us try to understand better what we mean by redundancy . If you think of the possible
ways to store products, categories, and subcategories on a disk file, you have these two
■ You can store the textual description of category and subcategory in each product,
as the logical model requires. You end up repeating the same description for each
product with a specific category.
■ On the other hand, you can store the textual description of each category only once
and then reference it from subcategories by using a number, which usually takes less
space than the description. Moreover, you can perform the same operation with the
subcategory, storing its description only once, in the subcategory table, and using a
reference from the products table that links to the subcategory. This technique leads
to a smaller file, even if it is harder to read.