troubleshooting Question

staging model: How do I retain integrity and assure uniqueness?

Avatar of Paula DiTallo
Paula DiTalloFlag for United States of America asked on
C#Microsoft SQL Server 2008
3 Comments1 Solution237 ViewsLast Modified:
Techies--
 
I need to model a staging database that is used to process incoming xml files from one vendor and api data from another.  The incoming records per load can contain all unique header/detail_line/sub_detail_line data, the same record again with updates to elements, or the same record with an added/removed detail_line/sub_detail_line. I do not, or will not know which.

How do model this? Do I use something like an added_row_timedate column to capture the uniqueness of an instance of a record/data object along with the obvious keys?  Do I create concatenated primary keys that include the added_row_timedate column? Do I not do that and generate synthetic/sequential identifier keys? Do I bother with keys in the staging environment at all, or do I enforce integrity through unique constraints?

The code snippet has a sample XML stream and the obvious portion of the model. Please advise me.


--<AllData>
--<DataObj>
--  <HeaderObj>
--     <hdr_id>1000</hdr_id>
--     <hdr_elem_2>my header element</hdr_elem_2>
--     <DetailObj>
--        <dtl_id>10</dtl_id>
--        <dtl_elem_2>my detail line element</dtl_elem_2>
--        <SubDetailObj>
--          <sub_id>1</sub_id>
--          <sub_elem_2> my subdetail line element</sub_elem_2>
--        </SubDetailObj>
--     </DetailObj>
--  </HeaderObj>
--</DataObj>
--<DataObj>
--  <HeaderObj>
--     <hdr_id>1010</hdr_id>
--     <hdr_elem_2>my other header element</hdr_elem_2>
--     <DetailObj>
--        <dtl_id>10</dtl_id>
--        <dtl_elem_2>my other detail line element</dtl_elem_2>
--        <SubDetailObj>
--          <sub_id>1</sub_id>
--          <sub_elem_2> my other subdetail line element</sub_elem_2>
--        </SubDetailObj>
--     </DetailObj>
--  </HeaderObj>
--</DataObj>
--<DataObj>
--  <HeaderObj>
--     <hdr_id>1000</hdr_id>
--     <hdr_elem_2>my header element</hdr_elem_2>
--     <DetailObj>
--        <dtl_id>10</dtl_id>
--        <dtl_elem_2>my detail line element</dtl_elem_2>
--        <SubDetailObj>
--          <sub_id>1</sub_id>
--          <sub_elem_2> my subdetail line element</sub_elem_2>
--        </SubDetailObj>
--     </DetailObj>
--        <dtl_id>20</dtl_id>
--        <dtl_elem_2>my new detail line element</dtl_elem_2>
--     </DetailObj>
--         <SubDetailObj>
--         </SubDetailObj>
--  </HeaderObj>
--</DataObj>
--</AllData>

create table [academic].[load_xaction_data]
( LoadId int not null,          -- sequential identifier/synthetic key
  ImportSourceId int not null,  -- coming from another table that tracks whether source is api or xml files
  ImportDate datetime not null,
  LoadBeginTime datetime not null,
  LoadEndTime datetime not null);
  
create table [academic].header_obj
 ( hdr_id int not null,                  -- retained exisiting data object key
   hdr_element_2 varchar(50) not null ); -- other header element
   
create table [academic].detail_obj
( hdr_id int not null,					-- retained existing data object key from header
  dtl_id int not null,					-- retained existing data object key from detail
  dtl_elem_2 varchar(50) not null);		-- other detail header element
  
create table [academic].sub_obj
( hdr_id int not null,				    -- same pattern
  dtl_id int not null,
  sub_id int not null,
  sub_elem_2 varchar(50) not null);
ASKER CERTIFIED SOLUTION
Christopher Kile
Senior Software Analyst

Our community of experts have been thoroughly vetted for their expertise and industry experience.

Join our community to see this answer!
Unlock 1 Answer and 3 Comments.
Start Free Trial
Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.
Unlock 1 Answer and 3 Comments.
Try for 7 days

”The time we save is the biggest benefit of E-E to our team. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange.

-Mike Kapnisakis, Warner Bros