Link to home
Start Free TrialLog in
Avatar of sqlcurious
sqlcuriousFlag for United States of America

asked on

need to archive data for every 6 months

Hi Experts,
I need to archive data from a table(Original Table ) into a new table(Archive Table) every six months.
For example: Archive table after year should have 1 year data + past 6 months data
So how should I have the query such that it would be picking data dynamically and also checks and makes sure its not duplicating data, please suggest how we need to have the table structure of the archive data ?
So Original table has a jobid which is an unique id. Structure of Original table:
JobID | CustNo | OpenDate | CloseDate
Avatar of Koen Van Wielink
Koen Van Wielink
Flag of Netherlands image

I'd go with a table structured in the same way, just add an ArchiveDate column with the date of archiving.
Your query would look something like this:

Insert into JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select	a.*
, getdate()
from JobsTable a
where	not exists
	(select	1
	from	JobsArchive b
	where	a.JobID = b.JobID)

Open in new window


You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:

Insert into #JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select		a.*
		,	getdate()
from #JobsTable a
where	not exists
	(select	1
	from	#JobsArchive b
	where	a.JobID = b.JobID)
and	a.opendate >= (select max(opendate)
					from #jobsArchive c)

Open in new window

I'd go with a table structured in the same way, just add an ArchiveDate column with the date of archiving.
Your query would look something like this:

Insert into JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select	a.*
, getdate()
from JobsTable a
where	not exists
	(select	1
	from	JobsArchive b
	where	a.JobID = b.JobID)

Open in new window


You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:

Insert into #JobsArchive
(	JobID
,	CustNo
,	OpenDate
,	CloseDate
,	ArchiveDate
)

Select		a.*
		,	getdate()
from #JobsTable a
where	not exists
	(select	1
	from	#JobsArchive b
	where	a.JobID = b.JobID)
and	a.opendate >= (select max(opendate)
					from #jobsArchive c)

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Koen Van Wielink
Koen Van Wielink
Flag of Netherlands image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of sqlcurious

ASKER

Thanks a lot  Kvwielink, if I have to use just a query and not a job how would I do it? use dateadd? please suggest?
Hi Sqlcurious,

The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.

Rgds,

Kvwielink
Hi Sqlcurious,

The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.

Rgds,

Kvwielink
Hi Sqlcurious,

The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.

Rgds,

Kvwielink
Hi Kvwielink,
I am having errors in this part, not sure why this isnt working, seems right to me, please help
where not exists
      (select      1
      from      #JobsArchive b
      where      a.JobID = b.JobID)
and      a.opendate >= (select max(opendate)
                              from #jobsArchive c)
What's the error you're getting?
thanks