Link to home
Start Free TrialLog in
Avatar of edfreels
edfreelsFlag for United States of America

asked on

Data Warehouse Fact Table Design Problem

I'm looking for a good idea on the way to layout my fact table and still follow best practice. This looked like a good forum for that. Here's what I have:

I'm designing a DW for the convenience store industry. This is just for transactional data that happens on the point of sale system. Each transaction may have multiple items and also multiple payment types. To explain in a real world scenario: Say the customer buys a coke and bag of chips and $12 in fuel. Say the total is $14. The customer then hands the cashier a gift card with $10 on it and pays cash for the rest. This leaves us with 3 items sold and 2 payment types.

One of the business rules is to be able to report on what type of items were purchased with a certain payment type. ie. Gift Card, Credit Card, Debit Card, Cash, etc. They also want to be able to report on the dollar amount of each payment type. And, of course, there is a need to report on item sales. Here's where I'm stumped. I have this many to many relationship between Item Sales and Payment types with the organization, date, transaction number being the one thing unique between the two. I have tried to find any kind of example of this type of design, but every design I've found only contains the Sales and doesn't address the payment of the sale in this fashion.

I have come up with a couple of ideas, but I'm not sure which way to go with this. I have attached an image of the fact table layout for the sales transaction not including the payment data. One option is to add the following columns to the FactSales table: PaymentKey1, PaymentKey1Amount, PaymentKey2, PaymentKey2Amount, PaymentKey3, PaymentKey3Amount, PaymentKey4, PaymentKey4Amount, PaymentKey5, PaymentKey5Amount. I really don't like this idea because it goes against all I've ever learned about good database design. This will also cause a challenge in reporting in having to pivot the data and then breaking it out into a separate data set to prevent duplication of amounts.

Another option I came up with would be to create a completely separate fact table for payment data and duplicate the organization, date and transaction number in this fact table so I can join the two fact tables together when reporting. So basically have 2 different star schemas that are related.  I don't know if that will effect performance though.

Any suggestions, thoughts or ideas on this are greatly appreciated!
 
  User generated image
Avatar of Tim Humphries
Tim Humphries
Flag of United Kingdom of Great Britain and Northern Ireland image

Hi,

A couple of things - to report the dollar amount against item and payment method (accurately? correctly?) you'd really need to be able to explicitly associate different item sales with different payment types. (all $10 Gift Card went on fuel, for example)

But from the scenario you describe you can't do this - the information just isn't captutred at source.
So I guess the best you can do here is to prorata the payment value between the items. So for the above, fuel is 12/14 (6/7) of each payment type and the chips and coke fight over the remaining 1/7 depending on their relative sell price (which you will know). So fuel has 6/7 * $10 attributed to Gift Card and 6/7 * $4 Cash.

As far as the fact table design goes, I would split the transaction into three - using payment method to make the transaction code unique. I'd just append 'GC' or 'C' or whatever to the transaction id you already will have just to make it unique.

You can then create multiple simple fact rows - one for each item and payment method combo, with their own sell price, sell date, product, store and payment method dimension references and the modified (unique) transaction id. You'll have 6 in this case - number of items sold * number of payments methods.

The structure becomes very simple. Sum all sales values for the transaction and you get $14. Sum by payment type and you get your $10 / $4 payment method split.

Only complexity here is a bit of work upfront perhaps to split your sales transaction into multiple components - how easy this is depends on what your source data looks like, but easy enough in SSIS or SQL.

Tim
ASKER CERTIFIED SOLUTION
Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of edfreels

ASKER

Hi Tim,

Thanks for the input.  Unfortunately, breaking out the payment would show inaccurate information and, in the case where there may be 2 payment types for 1 item sold, this wouldn't work.  It also seems there would be a lot of duplicated information in the fact table with this theory.  Thanks though!

Hi Kent,

Aha!  That is a great idea!  I didn't think about using the ticket information as a dimension!  

Now, just thinking this through, there are 3 columns that make the ticket unique: Organization(Store), TicketDate and TicketId. One other factor I didn't share is, I plan on partitioning my fact tables on the TicketDateKey. So if I break those 3 columns out into a dimension, how is that going to work for the partitioning of the fact tables?

Hi Kent,

I just started my first cup of coffee this morning and didn't think that completely through before I responded.  I will just create a dimension table with those 3 fields and use that to create the foreign key to both fact tables.  Thanks so much for the advice!  This will work perfectly!

Ed
First cup of coffee?  I usually require that just to get out of bed.  :)

Glad that I was able to help,
Kent