nsyyoung (United States of America) asked:
Huge transaction log/erroneous Modified time

Good morning,
We have an ongoing problem with one content database (there are several others that are unaffected): the transaction log is enormous and the bandaid is to add space to the backup drive (yes, I can hear you choking on your coffee, please let that slide for now).

Here's what I know:
One
IT Site Collection > Projects subsite > Site Contents
All subsites of the Projects site are showing as "Modified 3 minutes ago".
Depending on when I look, this value may be anywhere from 1 to 6 minutes ago.

IT Site Collection > Projects subsite > Site Contents > Child site > Site Contents
The Documents doc lib shows "Modified 3 minutes ago".
No other lists or apps are listed as modified sooner than 2 weeks ago.

IT Site Collection > Projects subsite > Site Contents > Child site > Site Contents > Documents doc lib (not settings)
Nothing was modified more recently than 43 minutes ago.

IT Site Collection > Site Contents
AllITDocuments doc lib was modified 2 minutes ago.
The most recently edited file was about an hour ago.
IT Site Collection > Site Contents > a custom list (with 4 items in it) shows as modified 3 minutes ago.
The most recent change to this list was in 2013.

Two
Our DBA ran an audit that showed that the doc lib noted above is showing up in an Event Cache list as having 15,189 recent actions performed on it, compared with anywhere from 1 to about 5,000 on other lists.

Something is making changes to the database every minute or so, but we can't query the db to find out because we don't know what to look for.


Has anyone seen anything like this and how do I fix it?  Our bandaid is obviously unsustainable and we need this mystery solved.

Thanks,
nsy
ASKER CERTIFIED SOLUTION
Brian Crowe (United States of America)
This solution is only available to Experts Exchange members.
nsyyoung (ASKER):
From my DBA's email:
There is something happening on the GFIT site that is making a lot of database changes, and I don’t think it’s user activity.  This morning we had a 1.8 GB log backup.  The log backups stop on Saturday night and resume Monday morning.  This represents a whole day’s activity.  To put it in perspective, the full backup is only 6.1 GB.  

The only database that has similar activity is the crawl store database after a full crawl.  GF Portal/Finance is 28GB and only changes 100 – 200 MB.

So, what’s running in the GFIT Site that is changing so much data?

Does that help?  (I don't speak SQL very well.)
Yes, full recovery, TLog backup every hour.  I'll get back to you on the query.
The object_name function gets us a little more information. I’m not sure what the time span for the execution count is, so it’s difficult to put that into context. But here’s what we see: the most executed procedures that would effect a change in the database are:

proc_ProcessSiteQuotaForStorageMetricsChanges
proc_QMChangeSiteDiskUsedAndContentTimestamp
proc_AppendSiteQuota

The names of all of these have to do with site quota.  Is something set up for quotas in GFIT that isn’t in other sites?


There is also a procedure named “proc_UpdateListFields” that appears to update the EventCache table we looked at yesterday.
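For context, the kind of procedure-level audit the DBA describes above (execution counts resolved with the object_name function) typically comes from the sys.dm_exec_procedure_stats DMV. A sketch of such a query run from PowerShell — server and database names here are placeholder assumptions, and the stats only cover the time since each plan was cached, which is the time-span caveat mentioned above:

```powershell
# Sketch: rank stored procedures in the content database by execution count.
# 'SQLSERVER01' and 'WSS_Content_GFIT' are assumed names; requires Invoke-Sqlcmd.
$query = @"
SELECT TOP (20)
    OBJECT_NAME(ps.object_id, ps.database_id) AS proc_name,
    ps.execution_count,
    ps.cached_time        -- counts reset when the plan leaves the cache
FROM sys.dm_exec_procedure_stats AS ps
WHERE ps.database_id = DB_ID('WSS_Content_GFIT')
ORDER BY ps.execution_count DESC;
"@
Invoke-Sqlcmd -ServerInstance 'SQLSERVER01' -Database 'WSS_Content_GFIT' -Query $query |
    Format-Table -AutoSize
```

Querying a SharePoint content database directly is read-only diagnostics at best; Microsoft does not support modifying it.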

There are no special quotas set on this site collection or any other company site collections.   There is a 100MB quota in place on personal sites only.
A 1.8GB log file in itself isn't alarming.  I have several databases in our production OLTP environment that are significantly larger than that with a 30 minute increment on log backups.  I imagine what you are responding to is a marked increase in database activity.

Do you have any kind of monitoring software (Solarwinds, RedGate, Apex, etc.) that you can utilize?  Were there any other system changes that might correspond to when you saw things change?
That would be 1.8GB of changes when no one is working (i.e., Sunday), so yes, it is alarming that such a marked increase in db activity would be occurring.

We have Logstash and SolarWinds, yes. Logstash crashed yesterday, but we're waiting for a report from it.

We do have more information. There is a custom site template in use on MOST project subsites. On the sites that don't use the custom template, there is no minute-by-minute updating. If we create a new doc library and delete the existing one, the problem goes away, or at least it stops updating every minute. I have yet to ask the DBA whether that makes a difference on the database side.

If that does matter according to the DBA, then ideally we'd like to fix the site template in use.  Is that even possible?  Barring that, can we change the site template ID to one that isn't broken?

Questions abound.
SOLUTION
This solution is only available to Experts Exchange members.
Brian, it was your answer that got us on the right track to solving it, and Vitor, you were very close.  Thanks.
So, what was the cause of the issue?
Uhhhh, I wrote out a whole explanation, which I apparently obliterated with my later comment.

Here's a better explanation from the creator of the mischief:
Cause:
Back when we created the GFIT site, we had decided that more metadata was needed to keep the document library from becoming the disorganized cluster that is my kitchen junk drawer and the shared drives. So we created these metadata fields. Then we populated these fields with lists of possible choices to keep people from typing in similar but uselessly different metadata like “SharePoint” vs. “Share Point”, or “SQL” vs. “SQL Server 2008 R2”. These standard metadata values were put into lists, and the choice box columns were populated with a lookup from those lists. To update the choice box columns when someone added a value to the list, I wrote a PowerShell script that would sync the column choices with the list. This script ran every 5 minutes.
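The original script isn't shown in the thread, but a minimal sketch of what such a sync task likely looked like, using the SharePoint server object model (the site URL, list name, and column name below are all hypothetical):

```powershell
# Hypothetical reconstruction of the 5-minute sync task described above:
# copy the items of a "standard values" list into a choice column on each
# document library. Every Update() below is a write to the content database,
# which is why running this on a timer against 60+ libraries bloats the log.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$web = Get-SPWeb 'http://portal/sites/GFIT'           # assumed site URL
$sourceList = $web.Lists['Technology Values']         # assumed lookup list
$choices = $sourceList.Items | ForEach-Object { $_['Title'] }

foreach ($list in $web.Lists | Where-Object { $_.BaseType -eq 'DocumentLibrary' }) {
    $field = $list.Fields.TryGetFieldByStaticName('TechnologyTag')  # assumed column
    if ($field -is [Microsoft.SharePoint.SPFieldChoice]) {
        $field.Choices.Clear()
        $choices | ForEach-Object { [void]$field.Choices.Add($_) }
        $field.Update()   # each call writes schema changes to the content DB
    }
}
$web.Dispose()
```

Note the unconditional Update() calls: even when nothing has changed, every run still generates writes. Comparing the existing $field.Choices against $choices and skipping unchanged columns would have avoided most of the churn.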

Symptoms:
This update process was not noticeable when there were only 3 or 4 document libraries, but now that there are 60+, the effect was significantly bigger. The database was making 50+ MB of changes every hour.

Solution:  
I stopped the scheduled task from running the PowerShell script. That stopped the unnecessary activity.

We're going to change it to a workflow that fires on New or Changed Item and starts that task to run once, and add this to our Lessons Learned list.
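For anyone hunting down a similar culprit, disabling the task (rather than deleting it) keeps it available if it's needed again. A sketch, with the task name assumed:

```powershell
# Disable the offending scheduled task without deleting it.
# 'SyncMetadataChoices' is an assumed name; list tasks first to find yours.
schtasks.exe /Query /FO LIST | Select-String -Pattern 'TaskName'
schtasks.exe /Change /TN '\SyncMetadataChoices' /DISABLE
# On Server 2012+ the ScheduledTasks module offers an equivalent:
# Disable-ScheduledTask -TaskName 'SyncMetadataChoices'
```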
A PowerShell script triggered by a scheduled task: hard to remember, and harder to track if the people who built the process aren't there anymore.

Thanks for the explanation.