Data Warehouse Fact Table Design Problem

Posted on 2011-03-24
Medium Priority
Last Modified: 2012-05-11
I'm looking for a good idea on the way to layout my fact table and still follow best practice. This looked like a good forum for that. Here's what I have:

I'm designing a DW for the convenience store industry. This is just for transactional data that happens on the point of sale system. Each transaction may have multiple items and also multiple payment types. To explain in a real world scenario: Say the customer buys a coke and bag of chips and $12 in fuel. Say the total is $14. The customer then hands the cashier a gift card with $10 on it and pays cash for the rest. This leaves us with 3 items sold and 2 payment types.

One of the business rules is to be able to report on what type of items were purchased with a certain payment type. ie. Gift Card, Credit Card, Debit Card, Cash, etc. They also want to be able to report on the dollar amount of each payment type. And, of course, there is a need to report on item sales. Here's where I'm stumped. I have this many to many relationship between Item Sales and Payment types with the organization, date, transaction number being the one thing unique between the two. I have tried to find any kind of example of this type of design, but every design I've found only contains the Sales and doesn't address the payment of the sale in this fashion.

I have come up with a couple of ideas, but I'm not sure which way to go with this. I have attached an image of the fact table layout for the sales transaction not including the payment data. One option is to add the following columns to the FactSales table: PaymentKey1, PaymentKey1Amount, PaymentKey2, PaymentKey2Amount, PaymentKey3, PaymentKey3Amount, PaymentKey4, PaymentKey4Amount, PaymentKey5, PaymentKey5Amount. I really don't like this idea because it goes against all I've ever learned about good database design. This will also cause a challenge in reporting in having to pivot the data and then breaking it out into a separate data set to prevent duplication of amounts.

Another option I came up with would be to create a completely separate fact table for payment data and duplicate the organization, date and transaction number in this fact table so I can join the two fact tables together when reporting. So basically have 2 different star schemas that are related.  I don't know if that will effect performance though.

Any suggestions, thoughts or ideas on this are greatly appreciated!
  FactSales Table Layout
Question by:edfreels
  • 2
  • 2
LVL 15

Expert Comment

by:Tim Humphries
ID: 35213478

A couple of things - to report the dollar amount against item and payment method (accurately? correctly?) you'd really need to be able to explicitly associate different item sales with different payment types. (all $10 Gift Card went on fuel, for example)

But from the scenario you describe you can't do this - the information just isn't captutred at source.
So I guess the best you can do here is to prorata the payment value between the items. So for the above, fuel is 12/14 (6/7) of each payment type and the chips and coke fight over the remaining 1/7 depending on their relative sell price (which you will know). So fuel has 6/7 * $10 attributed to Gift Card and 6/7 * $4 Cash.

As far as the fact table design goes, I would split the transaction into three - using payment method to make the transaction code unique. I'd just append 'GC' or 'C' or whatever to the transaction id you already will have just to make it unique.

You can then create multiple simple fact rows - one for each item and payment method combo, with their own sell price, sell date, product, store and payment method dimension references and the modified (unique) transaction id. You'll have 6 in this case - number of items sold * number of payments methods.

The structure becomes very simple. Sum all sales values for the transaction and you get $14. Sum by payment type and you get your $10 / $4 payment method split.

Only complexity here is a bit of work upfront perhaps to split your sales transaction into multiple components - how easy this is depends on what your source data looks like, but easy enough in SSIS or SQL.

LVL 46

Accepted Solution

Kent Olsen earned 2000 total points
ID: 35214499
Hi Ed,

Your fact table appears to contain an entry for each item sold.  I'd certainly look to normalize some of the information.  e.g. the row contains a TicketId value.  Perhaps that should be a foreign key to another dimension table?

Take the same approach with payment information.  Create another fact table with payment data.  Amount, type, etc.  It would share the dimension table "Ticket" with the Sales table.

That will give you a much more flexible design and not clutter up the "Sales" table with payment information.

Good Luck,

Author Comment

ID: 35214704
Hi Tim,

Thanks for the input.  Unfortunately, breaking out the payment would show inaccurate information and, in the case where there may be 2 payment types for 1 item sold, this wouldn't work.  It also seems there would be a lot of duplicated information in the fact table with this theory.  Thanks though!

Hi Kent,

Aha!  That is a great idea!  I didn't think about using the ticket information as a dimension!  

Now, just thinking this through, there are 3 columns that make the ticket unique: Organization(Store), TicketDate and TicketId. One other factor I didn't share is, I plan on partitioning my fact tables on the TicketDateKey. So if I break those 3 columns out into a dimension, how is that going to work for the partitioning of the fact tables?


Author Comment

ID: 35214808
Hi Kent,

I just started my first cup of coffee this morning and didn't think that completely through before I responded.  I will just create a dimension table with those 3 fields and use that to create the foreign key to both fact tables.  Thanks so much for the advice!  This will work perfectly!

LVL 46

Expert Comment

by:Kent Olsen
ID: 35215114
First cup of coffee?  I usually require that just to get out of bed.  :)

Glad that I was able to help,

Featured Post

Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Exchange database can often fail to mount thereby halting the work of all users connected to it. Finding out why database isn’t mounting is crucial and getting the server back online. Stellar Phoenix Mailbox Exchange Recovery is a champion product t…
In this article, we will see two different methods to recover deleted data. The first option will be using the transaction log to identify the operation and restore it in a specified section of the transaction log. The second option is simpler and c…
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
The Relationships Diagram is a good way to get an overall view of what a database is keeping track of. It is also where relationships are defined. A relationship specifies how two tables connect to each other. As you build tables in Microsoft Ac…

619 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question