March 21, 2015
Just a note on Data before we get started, i loved this quote which i recently got to read –
“Data are not taken for museum purpose; they are taken as a basis for doing something. If nothing is to be done with the data, then there is no use in collecting any. The ultimate purpose of taking data is to provide a basis for action or a recommendation for action.” [an i quote from W. Edwards Deming, a renowned Statistician]
Before we could jump into details of types of Facts, let’s get some clarity on what a fact table is and what way it plays a role in a Warehouse. In the traditional Data warehouse model we have the Facts and dimensions, where the Facts play a major role giving more meaning and information to data and for the end users to get to use it on trend analysis and on many more reporting purposes.
As is noted and said by Kimball – “Fact tables are the foundation of the data warehouse. They contain the fundamental measurements of the enterprise, and they are the ultimate target of most data warehouse queries.”
Additive Fact tables
An additive Fact table is one, which has the transaction data giving the information on different dimensions of each transaction and the measurements (Amount/hours/value) on each. This is a kind of a fact table that most of the Data model would have used, with the measurements giving way for plenty of analysis and information. This is a kind of fact table which can bee summed/aggregated through all the dimension columns to get a grouped by analysis and result giving the summed Measurement/amount with respect to different dimension combinations.
Example is a sales table, where you have dimension like Customer, Date of sales, Sales ID and the measurements like Quantity and Amount/Price. In this case you can Group by a customer or the Day of sale or even by the Sales_ID to get an aggregated Amount and Quantity or both.
Semi Additive Facts
A semi additive Facts are the one which comprises of Dimensions and the measurement information, and the facts are summed up with respect to some of the dimensions and can not be summed on the rest.
A good example can be the banking balance tracking table, that gives the debit and credit information and the balance post any transaction of the respective Customer. Here it would not make any sense to sum up the amounts with respect to the date, which would be meaningless measurement. Whereas the same can be helpful to get the track on the balance amount on Customer wise.
Non Additive Facts
A non additive Fact would be with the measurements and dimensions where there would be no summing/aggregation that can give any meaning to the data.
An example can be the table giving information on the Discounts with respect to the Products and on the respective dates. Here there can be no meaning made with the Discount being summed with any of the dimensions to make the data informative.
The fact tables, completely built on the Dimensions only are called as the Fact-less ones. These facts basically help in getting the combination of different dimensions to make a meaning with the intersection of them.
A very good example can be a table giving information on the Promotional code for each product on a date, here all are dimensions and they would still make sense in the end user’s point to build it as a fact table bringing the Product for a date with the right Promo code assigned to it.
Hope this was helpful in understanding the basic types of facts that we can build as per the need and requirement on our data model. And the decision to pick the right Fact type has to always be a based on the scenario and the need of the end users (Reporting/Analytic).