Row versioning data inside a table

Hi guys,

Loooking for view / opinions on the following ( got a debate with someone so thought Id ask everyone here)...

Im going to comapre two ways in which i can see i could version groups of data within a table.

The business need here is that we have forms that have fields which contains values from lookup fields. We are basically creating a table for lookup fields, the requirement is that we should be able to see the easily version these lookups. so when one or more values change we create a new version of our available lookup fields.

Consider the following two tables...

CREATE TABLE [dbo].[Method1](
    [TypeID] [int] IDENTITY(1,1) NOT NULL,
    [TypeName] [nvarchar](100) NOT NULL,
    [Version] [int] NOT NULL,
    [Value] [nvarchar](100) NOT NULL
)

CREATE TABLE [dbo].[Method2](
    [TypeID] [int] IDENTITY(1,1) NOT NULL,
    [TypeName] [nvarchar](100) NOT NULL,
    [ValueID] [int] NOT NULL,
    [Value] [nvarchar](100) NOT NULL,
    [Deleted] [bit] NOT NULL
)

Now in table Method1 data will look like this....

TypeID TypeName Version Value
------- ----------- ---------- -------------------------------------
1      Type 1       1        Some Value 1
2      Type 1       1        Some Value 2
3      Type 2       1        Some Value 1
4      Type 2       1        Some Value 2

To add / remove or even edit values of a given TypeName you create a new version of my TypeName

ie...

TypeID TypeName Version Value
------- ----------- ---------- -------------------------------------
1       Type 1       1       Some Value 1
2       Type 1       1       Some Value 2
5       Type 1       2       Some Value 1
6       Type 1       2       Some Value 2
7       Type 1       2       Some Value 3
3       Type 2       1       Some Value 1
4       Type 2       1       Some Value 2

I kinda see method one as explicitly saying this is my TypeName, i.e. these are its versions and these are its values.

Method2 data looks like this....

TypeID TypeName ValueID Value Deleted
-------- ------------ ---------- --------- ---------
1       Type 1       1       Value 1       0
2       Type 1       2       Value 2       0
3       Type 2       1       Value 1       0
4       Type 2       2       Value 2       0

To add data - create a new row of a given type and new value id

TypeID TypeName ValueID Value Deleted
-------- ------------ ---------- --------- ---------
1       Type 1       1       Value 1       0
2       Type 1       2       Value 2       0
3       Type 2       1       Value 1       0
4       Type 2       2       Value 2       0
5       Type 1       3       Value 3       0

To edit data - create a new row of a given type and reuse the value id

TypeID TypeName ValueID Value Deleted
 -------- ------------ ---------- --------- ---------
1       Type 1       1       Value 1       0
2       Type 1       2       Value 2       0
3       Type 2       1       Value 1       0
4       Type 2       2       Value 2       0
5       Type 1       1       Value 1a      0

To remove data - inset the type name again and mark as deleted

TypeID TypeName ValueID Value Deleted
-------- ------------ ---------- --------- ---------
1       Type 1       1       Value 1       0
2       Type 1       2       Value 2       0
3       Type 2       1       Value 1       0
4       Type 2       2       Value 2       0
5       Type 1       1       Value 1       1

Now i see method two (for better or worse at this stage) as kinda implicit versioning in that you having to rely on the developer knowing that the max type id for typname / valueid pair is the most recent value.

As much as i dislike duplicating data I cant help but feel method one is easier to understand and has benefits in making it easier to report on.

What are peoples views on how to keep versions of groups of data in the table (based on these examples or other methods).

Views much appreciated.
noel58Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

MacNuttinCommented:
The deleted feild is nice because it allow dual use of the table. That is to say you can always rely on deleted flag to be 0 for only one TypeName so JOIN Method2  m2 on xx.other = m2.typename and m2 deleted = 0

Also the Lookup table can show a complete version history.
0
noel58Author Commented:
Hi, yeah the delete column could I guess be applied to both.

When you menntion a complete version history - you regarding to method 1?

I guess this is all really boiling down to if its best to copy all the rows or not for each new version.

Regards
0
MacNuttinCommented:
Ya from what I can see both methods give you a complete version history. Just that method2 is easier to code to get the current verion with the delete flag Method 1 as you pointed out can too but it has to use select MAX(date) where....
0
Protecting & Securing Your Critical Data

Considering 93 percent of companies file for bankruptcy within 12 months of a disaster that blocked access to their data for 10 days or more, planning for the worst is just smart business. Learn how Acronis Backup integrates security at every stage

noel58Author Commented:
where...?

The other thing i also considered is if i wanted to see a version at a particular point in time.

Method 1 could have a date time lookup based on the type name & version number

Method 2 would require each and every row in the type table to have a datetime.

I guess the other thing to consider is that i can get a unique clustered index on cols TypeName, Version, Value in method 1 where as i can't in method 2.

Im feeling to go with version 1 even though id have to work out the max version id for the latest version. I feels it clearer on what the data in the table is trying to express.

What mehod would you personally go for?

Regards
0
MacNuttinCommented:
Oh sorry you have no date column select MAX(version) from Method1 where typeName =

My point is you don't want to do that if you have the deleted column - Which in my opinion is better.


I would call it activeFlag or currentStableVersion
0
SharathData EngineerCommented:
We used the second approach in our databse design (Deleted flag). We are comfirtable with this approach. we are dealing with huge volumes (Millions) of data.
In SQL Server 2008, you have the concept of CDC (Change Data Capture). CDC is also uses the same approach. FOr each table, SQL Server will create a CT table which contains the records which are modifed. The field __$operation will have the following values.
1 - Delete
2 - Insert
3 - Old Value (For Update)
4 - New Value (For Update)
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Databases

From novice to tech pro — start learning today.