?
Solved

Database Design Concept

Posted on 2016-10-14
3
Medium Priority
?
74 Views
Last Modified: 2016-10-14
We have a 4 separate databases for 4 consecutive years (2013, 2014, 2015, 2016). The customers are generally the same, but with more added each year and are identified by a unique customer number. Data changes from year to year.

For example, the unique ID in all 4 data bases is 22 which pulls up Jane. In 2013 she wasn't married, in 2014 she was and in 2015 she had a different phone number and finally there were no changes form 2015 to 2016.

Year     ID           Name                        Phone
2013     22          Jane Doe                   212-555-1212
2014     22          Jane Smith                212-555-1212
2015     22          Jane Smith                646-555-1212
2016     22          Jane Smith                646-555-1212

The goal is to store all data indefinitely and be able to search the history.

I'm guessing the best thing to do is merge the databases together into one? If that's the case, what is the best way to design the database? I'm open to other ideas.
0
Comment
Question by:Sean Holloway
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
3 Comments
 
LVL 44

Expert Comment

by:zephyr_hex (Megan)
ID: 41844123
Does the data have to be categorized by year, or do you only need to track changes?  I.E.  If Jane's info didn't change at all between 2013 and 2016, do you still need a record for each year?

If not, then one way to handle this would be to use one table, and add a IsCurrent boolean / bit column, a DateEffective date column and a DateExpired date column.
When something changes, expire the current record, and create a new record.  The most recent record would have IsCurrent = true.  Historical records would have IsCurrent = false and DateExpired = date before the day when IsCurrent was set to false.  EffectiveDate would be the date that the record was created.

Example:
On 10/1/2016, Jane became a new customer.

UniqueID      CustomerID     Name          DateEffective    IsCurrent    DateExpired
 123                22                      Jane Doe     10/1/2016          true             NULL

On 10/5/2016, Jane changed her last name
UniqueID       CustomerID     Name          DateEffective    IsCurrent    DateExpired
123                  22                       Jane Doe     10/1/2016          false            10/4/2016
783                  22                       Jane Smith   10/5/2016        true            NULL


Unique ID would be an IDENTITY(1,1) column, set as Primary Key.
0
 

Author Comment

by:Sean Holloway
ID: 41844149
It does not need to be categorized by year, I like your concept. I did forget to mention, I have about 25,000 records a year (100,000 records over all years).

UniqueID       ID           Name                        Phone                    DateEffective      IsCurrent    DateExpired
123                  22          Jane Doe                   212-555-1212         1/1/2013               false           12/31/2013
456                  22          Jane Smith                212-555-1212                                        false           12/31/2014
789                  22          Jane Smith                646-555-1212                                        false           12/31/2015
1011                22          Jane Smith                646-555-1212                                        true

What is the best way to compare and update the records like so in batch?
0
 
LVL 44

Accepted Solution

by:
zephyr_hex (Megan) earned 2000 total points
ID: 41844294
The basic concept is borrowed from a data warehouse technique called Slowly Changing Dimension (SCD).  There are various types of changes -- what I've described above is a Type 2 change.

SSIS has a built-in SCD tool that handles the comparisons / updates / new record creation:

https://blogs.msdn.microsoft.com/karang/2010/09/29/slowly-changing-dimension-using-ssis/

I use an alternative SSIS tool (which is faster and easier to maintain):

https://dimensionmergescd.codeplex.com/

There are some other alternatives to the native SSIS SCD tool:

http://www.statslice.com/slowly-changing-dimensions-in-ssis
0

Featured Post

Get proactive database performance tuning online

At Percona’s web store you can order full Percona Database Performance Audit in minutes. Find out the health of your database, and how to improve it. Pay online with a credit card. Improve your database performance now!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When table data gets too large to manage or queries take too long to execute the solution is often to buy bigger hardware or assign more CPUs and memory resources to the machine to solve the problem. However, the best, cheapest and most effective so…
In this series, we will discuss common questions received as a database Solutions Engineer at Percona. In this role, we speak with a wide array of MySQL and MongoDB users responsible for both extremely large and complex environments to smaller singl…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question