We help IT Professionals succeed at work.

How to clean out the SharePoint "Content_DB" of all User Content

I have a particular interest in creating an Entire SharePoint structure outside of the Production Environment i have.  I need to build a DEV & TEST environment.  Now, the problem that i am faced with is Hardware resources.  They are very little compared to the Production environment and well, I need to create these Environments.  

The ideal Solution would be to create the TEST systems and duplicate the PRD systems then copy the databases over.  With the limited amount of Hard Disk space, I will need to figure out a way or purchase a tool that will allow me to clean out the Content_DB for the Sites in question.  What i would like to be left with is an entire TEST envrionment that is exaclty like PRD, but minus the user content, basically leaving just the site heiarchy.  Now Microsoft decided to include the User Content and the SharePoint Site Content (Heiarchy) in the same exact "Content_DB" database.  I have created a few SQL scripts that filters out basic SharePoint files from the user content, but this is not 100% clean as some things are still left around.  This does not provide me with a Optimal TEST environment, as it is not 100% clean.  I did get the scripts pretty darn close and exact to cleanout the files.  But we need a better solution.  Now I am hoping that someone else has been in this same predictament and will have the solution that I need or something close to it.  

Any help would be great as I have searched as much as I could.  I am hoping that someone else had better luck at finding this answer.

I do know that the PRD & TEST systems should be aligned 100% exactly identical, however money does play into this decision, it is not an option to spend any money on hardware, but hopefully the cost of a tool would be lower and make this a reality.
Comment
Watch Question

Commented:
I have this issue too. My prod DB is 65GB and it takes forever to copy/restore to dev/UAT (different geographical locations). When it gets to 75GB I'm splitting it out to another DB, but that won't help overall.

You have 2 approaches.

1.- Delete the data from the current DB after doing a restore
Eg: truncate table dbo.AllDocStreams
...dbo.AllDocs
...dbo.AuditData
...dbo.AllUserData

2.- Create a new site then export dbo.Webs, dbo.Sites, dbo.Sites... etc, etc.


I haven't got a script that will do all the work for you, but it's fairly easy to delete > test > delete > test

Commented:
PS: with option 2 you can also use sharepoint designer to create web packages from Prod without the data OR using sharepoint save as template then import 1 by 1.

Author

Commented:
Thank you for your view on the situation.  There is a problem with that theory.  From everything that I can tell Microsoft has done to the Content_DB, it is the fact that the actual web pages that are displayed when you browse a site are stored in the Content_DB.  Therefore, if i were to empty or truncate the dbo.AllDocs database, this would break my site heiarchy.  We have around 150+ sites, so creating them is not an option.  Im looking only to remove the users Content, such as .doc, .docx, .ppt, etc...

I am still looking for a way to do this, however it is a challenge that I wish to see through.

Thank you,
Gary
Commented:
Delete content by list ID.

dbo.AllDocs
dbo.AllUserData

have a tp_list or list id column

so your query would be something like:

delete from dbo.AllUserData where tp_list in ('guid1', 'guid2')