Where Clause For Archive Table

I have 5 years of records in the production table (2013, 2014, 2015, 2016, and 2017 records). I need to move the past 4 years of records to the archive table and then delete them from the production table, leaving 1 year of records in the production table.

My question:
How do I achieve this in the WHERE clause, moving these records nightly in batches of 1 year each night?
Can you give me the script?
 

e.g.
insert into mytable (col1, col2, ...)
select col1, col2, ...
FROM [dbo].[CandidateHist]
WHERE (CandidateTS < dateadd(wk, -12, current_timestamp))


Delete FROM [dbo].[CandidateHist]
WHERE (CandidateTS < dateadd(wk,-12,current_TimeStamp))
faith victory asked:

Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
First of all, you need to use a datetime variable to keep the point in time used by both commands in sync; otherwise you'll delete more than you have archived (as time progresses).
I would also make sure to cut off the time portion of that date.
begin tran
declare @date date = dateadd(wk, -12, getdate())

insert into mytable (col1, col2, ...)
select col1, col2, ...
FROM [dbo].[CandidateHist]
WHERE CandidateTS < @date

Delete FROM [dbo].[CandidateHist]
WHERE CandidateTS < @date
commit

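Since the question asks for nightly one-year batches, here is a hedged sketch building on the variable approach above. The archive table name (CandidateHistArchive) and the col1/col2 column list are assumptions and must be adapted to the real schema.

-- A possible nightly batch: archive the oldest remaining year that is older than the 1-year cutoff.
-- CandidateHistArchive is an assumed archive table name; col1, col2 are placeholder columns.
begin tran;

declare @cutoff date = dateadd(year, -1, getdate());   -- keep the most recent year in production
declare @batchStart date, @batchEnd date;

-- find the oldest year still present that is older than the cutoff
select @batchStart = datefromparts(year(min(CandidateTS)), 1, 1)
from [dbo].[CandidateHist]
where CandidateTS < @cutoff;

set @batchEnd = dateadd(year, 1, @batchStart);
if @batchEnd > @cutoff set @batchEnd = @cutoff;        -- never archive past the cutoff

insert into [dbo].[CandidateHistArchive] (col1, col2 /* , ... */)
select col1, col2 /* , ... */
from [dbo].[CandidateHist]
where CandidateTS >= @batchStart and CandidateTS < @batchEnd;

delete from [dbo].[CandidateHist]
where CandidateTS >= @batchStart and CandidateTS < @batchEnd;

commit;

If nothing older than the cutoff remains, @batchStart stays NULL and both statements simply affect no rows.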

yo_bee, Director of Information Technology, commented:
I would do something like this; it is slightly different but gets the same results.

Select * into ArchiveTbl 
FROM [dbo].[CandidateHist]
Where  Datediff(day, CandidateTS, getdate()) > 365;

delete  [dbo].[CandidateHist]
Where  Datediff(day, CandidateTS, getdate()) > 365;


Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
Bad idea, because it uses a function on a database column in the where clause. If you have a constant date to check against, it is best to put any calculation into that date constant ;-).

yo_bee, Director of Information Technology, commented:
Thanks for that info. I assume that on a small set of data you will not see the performance hit, but when the data set is large, that is when you start seeing it.
yo_bee, Director of Information Technology, commented:
What if you used days vs. weeks?
Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
The issue is that the DBMS cannot use an index as soon as you apply dateadd, datediff, etc. to the column. Using datediff will therefore most likely result in a full table scan.
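For illustration, a sketch against the question's CandidateHist table (assuming an index on CandidateTS exists): the first predicate wraps the column in a function and forces a scan, while the second moves the calculation to the constant side so the index can be used.

-- Non-sargable: datediff() must be evaluated for every row, so the index cannot be seeked.
select *
from [dbo].[CandidateHist]
where datediff(day, CandidateTS, getdate()) > 365;

-- Sargable: the calculation is done once, the column stays bare, and an index seek is possible.
declare @cutoff date = dateadd(day, -365, getdate());
select *
from [dbo].[CandidateHist]
where CandidateTS < @cutoff;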
faith victory (Author) commented:
You guys are awesome! The quick responses are impressive.


Could one of you please give me, or point me to, the best script without a huge performance impact?
Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
Mine is the best, of course!
Besides the function use issue, you can apply SELECT INTO only if the target table does not exist yet. You certainly don't want to create a new archive table each day (or night).
yo_bee, Director of Information Technology, commented:
I would also run both and see what works for you.

Both are methods that get you the desired results.

I find my SELECT ... INTO method needs the least amount of coding, because you do not have to build the INSERT INTO (col1, col2, col3, ...) column list.
My WHERE clause might not be the best, but it is an option.

Maybe build a script that builds both suggestions into a single statement.
Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
SELECT INTO does not (re)create any relational metainfo like indexes or constraints. It is only used for temporary tables, or table copies which then receive primary keys etc. by additional scripting.

Creating 365 tables a year is worse than having to enumerate columns. A select spanning more than one day would become a nightmare ...
Yes, having to list the columns bears some risk of data loss if you do not keep the column lists of the source and target tables the same; you either build the INSERT dynamically so it always contains all columns, or maintain it manually. If the table structures can change, it is important to build in some checks (like comparing the column count of both tables, dynamically creating the column list, or comparing the table's column count with the insert's, ...). But that is a completely different topic.
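As an illustration of such a check (only a sketch; CandidateHistArchive is an assumed archive table name), the nightly job could compare the column counts of both tables via sys.columns and abort on a mismatch:

-- Abort the batch if the two tables no longer have the same number of columns.
declare @srcCols int, @dstCols int;

select @srcCols = count(*)
from sys.columns
where object_id = object_id('dbo.CandidateHist');

select @dstCols = count(*)
from sys.columns
where object_id = object_id('dbo.CandidateHistArchive');

if @srcCols <> @dstCols
begin
    raiserror('Column count mismatch between CandidateHist and its archive - aborting.', 16, 1);
    return;   -- stops the rest of the batch
end;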

yo_bee, Director of Information Technology, commented:
Fair statement.
Scott Pletcher, Senior DBA, commented:
It will be a lot faster to create all the archive copies, in whatever size batches you want, and then, rather than DELETE rows from the table, save the rows from 2017 on into another, temporary table, truncate the main table, and copy the current year back in.

You should archive all old data based on the clustering key of the table. Often that's an identity column (btw, far too often, sadly for overall performance). Whatever the clustering key is, use it to control the archive process. You can also verify the row count in the main table vs. the count of rows archived.

If you'll provide the clustering key column(s), I can provide actual code for this method.
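Since the clustering key wasn't posted, here is only a rough sketch of that flow, keying on the CandidateTS date and an assumed archive table name; the real code would use the actual clustered key column(s) and column list.

-- 1. Copy everything older than the current year into the archive (could be split into yearly batches).
insert into [dbo].[CandidateHistArchive] (col1, col2 /* , ... */)
select col1, col2 /* , ... */
from [dbo].[CandidateHist]
where CandidateTS < '20170101';

-- 2. Park the current year's rows in a temp table instead of deleting four years' worth.
select *
into #keep
from [dbo].[CandidateHist]
where CandidateTS >= '20170101';

-- 3. TRUNCATE is minimally logged and far faster than a mass DELETE
--    (requires that no foreign keys reference the table).
truncate table [dbo].[CandidateHist];

-- 4. Reload the current year (an identity column would need SET IDENTITY_INSERT ON and an explicit column list).
insert into [dbo].[CandidateHist] (col1, col2 /* , ... */)
select col1, col2 /* , ... */
from #keep;

drop table #keep;

-- 5. Verify: archive row count + production row count should equal the original total.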
Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
Well Scott, if the archiving is done daily, it is an average rate of 1:364 rows to delete. At that rate I doubt it is faster to do the export/truncate/import. I totally agree if there are more rows to delete than to keep. But as always, "it depends".
Scott Pletcher, Senior DBA, commented:
@Qlemo:

But the original question stated:
"I have 5 years of records in the production table (2013, 2014, 2015, 2016, and 2017 records). I need to move the past 4 years of records to the archive table and then delete them from the production table, leaving 1 year of records in the production table."

I don't see how "it depends", unless somehow 4 years of history are not much more than 1 year of current data?? NOT likely.
Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
You are correct; somehow I had a recurring daily job in mind, but that isn't what was asked for: "How do I achieve this in the WHERE clause, moving these records nightly in batches of 1 year each night?" reads just as you say.

So I agree: copy all data to a new table, truncate the original table, then reimport the 2017 data. Whether to use batches (1 year, 1 month, or whatever) does not matter much if they are only applied to the "copy all data" step.

The fastest operation, however, would be to rename the original table, recreate it, and insert the 2017 data into it from the archive. Then delete the 2017 rows in the archive table.
faith victory (Author) commented:
@Qlemo, regarding your last comment:

How can I do this on a live production table with applications referencing it for reads?
Qlemo ("Batchelor"), Developer and EE Topic Advisor, commented:
You need to find a short time slot without access. Rename and recreate is a quick operation done within a few seconds. It should even work while the table is in read-only use, but I'm not certain.

SSMS supports immediate rename on the table name directly, or you use
   sp_rename 'orgTbl', 'archTbl'
Before you do, use SSMS to generate the CREATE script for the table (make sure to include indexes, constraints, etc.), then rename as described, then run the CREATE script ASAP.

Now you can perform your data operations:
  insert into orgTbl select * from archTbl where yourdate >= '20170101'
and, after having checked everything is fine with that,
  delete from archTbl where yourdate >= '20170101'
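
Putting those steps together as one hedged sketch (orgTbl, archTbl, and yourdate are the placeholder names from above; the CREATE script has to be generated from the real table beforehand):

-- 1. Rename the full table so it becomes the archive (metadata-only, takes seconds).
exec sp_rename 'dbo.orgTbl', 'archTbl';

-- 2. Recreate the empty production table from the previously generated script,
--    including indexes, constraints, defaults, etc.
-- CREATE TABLE dbo.orgTbl ( ... );   -- run the scripted definition here

-- 3. Copy the current year back into the fresh production table.
--    (With an identity column, an explicit column list and SET IDENTITY_INSERT would be needed.)
insert into dbo.orgTbl
select * from dbo.archTbl
where yourdate >= '20170101';

-- 4. Only after verifying the copy, remove the current year from the archive.
delete from dbo.archTbl
where yourdate >= '20170101';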
faith victory (Author) commented:
Everyone was very helpful.