sqlcurious
asked on
need to archive data for every 6 months
Hi Experts,
I need to archive data from a table(Original Table ) into a new table(Archive Table) every six months.
For example: Archive table after year should have 1 year data + past 6 months data
So how should I have the query such that it would be picking data dynamically and also checks and makes sure its not duplicating data, please suggest how we need to have the table structure of the archive data ?
So Original table has a jobid which is an unique id. Structure of Original table:
JobID | CustNo | OpenDate | CloseDate
I need to archive data from a table(Original Table ) into a new table(Archive Table) every six months.
For example: Archive table after year should have 1 year data + past 6 months data
So how should I have the query such that it would be picking data dynamically and also checks and makes sure its not duplicating data, please suggest how we need to have the table structure of the archive data ?
So Original table has a jobid which is an unique id. Structure of Original table:
JobID | CustNo | OpenDate | CloseDate
I'd go with a table structured in the same way, just add an ArchiveDate column with the date of archiving.
Your query would look something like this:
You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:
Your query would look something like this:
Insert into JobsArchive
( JobID
, CustNo
, OpenDate
, CloseDate
, ArchiveDate
)
Select a.*
, getdate()
from JobsTable a
where not exists
(select 1
from JobsArchive b
where a.JobID = b.JobID)
You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:
Insert into #JobsArchive
( JobID
, CustNo
, OpenDate
, CloseDate
, ArchiveDate
)
Select a.*
, getdate()
from #JobsTable a
where not exists
(select 1
from #JobsArchive b
where a.JobID = b.JobID)
and a.opendate >= (select max(opendate)
from #jobsArchive c)
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks a lot Kvwielink, if I have to use just a query and not a job how would I do it? use dateadd? please suggest?
Hi Sqlcurious,
The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.
Rgds,
Kvwielink
The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.
Rgds,
Kvwielink
Hi Sqlcurious,
The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.
Rgds,
Kvwielink
The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.
Rgds,
Kvwielink
Hi Sqlcurious,
The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.
Rgds,
Kvwielink
The query I posted would just check for any new records in your job table that don't exist in the archive table yet. Not sure what you wish to use Dateadd for.
If you don't use a scheduled job you'd have to run this query manually every 6 months. I'd just schedule a job in the SQL Server Agent so that you don't have to worry about it anymore. Have the job execute every 6 months, and all your new records should be inserted into your archive table without interference.
Rgds,
Kvwielink
ASKER
Hi Kvwielink,
I am having errors in this part, not sure why this isnt working, seems right to me, please help
where not exists
(select 1
from #JobsArchive b
where a.JobID = b.JobID)
and a.opendate >= (select max(opendate)
from #jobsArchive c)
I am having errors in this part, not sure why this isnt working, seems right to me, please help
where not exists
(select 1
from #JobsArchive b
where a.JobID = b.JobID)
and a.opendate >= (select max(opendate)
from #jobsArchive c)
What's the error you're getting?
ASKER
thanks
Your query would look something like this:
Open in new window
You can schedule a job every 6 months that executes this query.
It would be good if besides the "not exists" check you could add a check on dates as well because that would reduce the number of records that have to be read over time. Your archive will grow and grow, and if all records have to be checked each time this will take longer and longer. For instance, if your opendate is chronological, you could add:
Open in new window