need to archive data for every 6 months

Hi Experts,
I need to archive data from a table(Original Table ) into a new table(Archive Table) every six months.
For example: Archive table after year should have 1 year data + past 6 months data
So how should I have the query such that it would be picking data dynamically and also checks and makes sure its not duplicating data, please suggest how we need to have the table structure of the archive data ?
So Original table has a jobid which is an unique id. Structure of Original table:
JobID | CustNo | OpenDate | CloseDate
sqlcuriousAsked:
Who is Participating?
 
Koen Van WielinkConnect With a Mentor Business Intelligence SpecialistCommented:
I'd go with a table structured in the same way, just add an ArchiveDate column with the date of archiving.
Your query would look something like this:

Insert into JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select	a.*
, getdate()
from JobsTable a
where	not exists
	(select	1
	from	JobsArchive b
	where	a.JobID = b.JobID)

Open in new window


You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:

Insert into #JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select		a.*
		,	getdate()
from #JobsTable a
where	not exists
	(select	1
	from	#JobsArchive b
	where	a.JobID = b.JobID)
and	a.opendate >= (select max(opendate)
					from #jobsArchive c)

Open in new window

0
 
Koen Van WielinkBusiness Intelligence SpecialistCommented:
I'd go with a table structured in the same way, just add an ArchiveDate column with the date of archiving.
Your query would look something like this:

Insert into JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select	a.*
, getdate()
from JobsTable a
where	not exists
	(select	1
	from	JobsArchive b
	where	a.JobID = b.JobID)

Open in new window


You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:

Insert into #JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select		a.*
		,	getdate()
from #JobsTable a
where	not exists
	(select	1
	from	#JobsArchive b
	where	a.JobID = b.JobID)
and	a.opendate >= (select max(opendate)
					from #jobsArchive c)

Open in new window

0
 
Koen Van WielinkBusiness Intelligence SpecialistCommented:
I'd go with a table structured in the same way, just add an ArchiveDate column with the date of archiving.
Your query would look something like this:

Insert into JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select	a.*
, getdate()
from JobsTable a
where	not exists
	(select	1
	from	JobsArchive b
	where	a.JobID = b.JobID)

Open in new window


You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:

Insert into #JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select		a.*
		,	getdate()
from #JobsTable a
where	not exists
	(select	1
	from	#JobsArchive b
	where	a.JobID = b.JobID)
and	a.opendate >= (select max(opendate)
					from #jobsArchive c)

Open in new window

0
Cloud Class® Course: CompTIA Cloud+

The CompTIA Cloud+ Basic training course will teach you about cloud concepts and models, data storage, networking, and network infrastructure.

 
sqlcuriousAuthor Commented:
Thanks a lot  Kvwielink, if I have to use just a query and not a job how would I do it? use dateadd? please suggest?
0
 
Koen Van WielinkBusiness Intelligence SpecialistCommented:
Hi Sqlcurious,

The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.

Rgds,

Kvwielink
0
 
Koen Van WielinkBusiness Intelligence SpecialistCommented:
Hi Sqlcurious,

The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.

Rgds,

Kvwielink
0
 
Koen Van WielinkBusiness Intelligence SpecialistCommented:
Hi Sqlcurious,

The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.

Rgds,

Kvwielink
0
 
sqlcuriousAuthor Commented:
Hi Kvwielink,
I am having errors in this part, not sure why this isnt working, seems right to me, please help
where not exists
      (select      1
      from      #JobsArchive b
      where      a.JobID = b.JobID)
and      a.opendate >= (select max(opendate)
                              from #jobsArchive c)
0
 
Koen Van WielinkBusiness Intelligence SpecialistCommented:
What's the error you're getting?
0
 
sqlcuriousAuthor Commented:
thanks
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.