Link to home
Start Free TrialLog in
Avatar of Martin Griffiths
Martin Griffiths

asked on

Freeing up disk space by deleting records in a SQL Server 2008 table

I have a data warehouse that stores internet access data for the firm's employees. Obviously this data grows rapidly over time. We've taken the decision that to save virtual server disk space and increase query processing speeds we only really need to keep the last year's worth of data in the data warehouse. The data warehouse is populated via an SSIS task which only adds records created since the last package run i.e. it doesn't truncate the warehouse tables and re-populate.

Am I right in thinking that, by just deleting records, the disk space used won't go down as the space has already been allocated to these deleted records? What about query speed?

I've just done the first pass of these deletes and then done a shrink database to recover the space, but don't want to do this too often really. Appreciate this can cause fragmentation etc but there are no indexes on the tables so is this really an issue? The database is in simple recovery mode by the way and Auto Shrink is False.

Bearing the above in mind, what's the best strategy to use?
SOLUTION
Avatar of microhead
microhead

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Jared_S
Jared_S

You are right that you won't be able to reclaim disk space just by deleting records.

I don't think you'll have worry about fragmentation since you don't have indexes.

You could experiment with using a truncateonly parameter in your shrinkdatabase command and see if you can reduce your disk space that way. Truncateonly won't affect any pages, but will still shrink file size.

Something like:

DBCC UPDATEUSAGE
SHRINKDATABASE(N'MyDatabase',<target size>,TRUNCATEONLY)
Avatar of Martin Griffiths

ASKER

microhead,

Are you saying use the execute sql task to delete the records from the destination data warehouse tables as part of my SSIS package run? How is this different to doing a deletion via t-sql as I've done already please? All the SSIS Package is doing as it's currently programmed is, it adds new records from the live web monitoring database on one server (i.e. any internet traffic since yesterday) in to the data warehouse tables on my other server.

The mdf and ldfs used to be around 24GB and 30GB. Look much better now after the database shrink (down to 12GB and 5MB!).

See what you mean with the clustered indexes, but we'll have the problem of having to rebuild indexes if we do a database shrink then won't we?
Jared_S,

My shrink database was done via a maintenance plan shrink database task which translates to what you're suggesting here i.e. a TRUNCATE_ONLY, but without the DBCC UPDATEUSAGE command. Should I be adding in that command do you think?

Either way, if I don't build any clustered indexes on this database in the future, it looks like I'm ok to schedule deletions of web usage records older than a year's old, with a database shrink now and again to keep its size down. Would you agree?
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi Grffster,

Sorry, there no difference with your T-sql script, i was just saying wat i like to do. Your T-sql works just as well if not better.

I can help you with a script that will reindex automatically if the index needs it. But on databases that size with few indexes, that doesn't take long. I have indexes lager then your database and they take no longer then a few minutes and this can be done online.  

But you mentioned that speed of the query's was the problem. keeping your database small will help but if you want to query data from the last month you will need to load the whole year to check if the data belongs in your query or not, now an index will speed up this portion of the query a hole lot.. you could simple use this code to see the difference:

DECLARE @start_time DATETIME

SET @start_time = GETDATE()

-- my query
SELECT * FROM @search_temp_table

SELECT RTRIM(CAST(DATEDIFF(MS, @start_time, GETDATE()) AS CHAR(10))) AS 'TimeTaken'

Open in new window



after the first run add an index on that table and run again. if there is no gain don't bother with it. But i''l bet you will be surprised.

Grt
A SHRINK with TRUNCATEONLY is ok, although it likely won't free much space, except right after the delete.

Again, the key thing is to delete before you insert the new data, so that the same space can be reused by the new data.
All three contributors have really helped here thanks.