Huge transaction log/erroneous Modified time

Good morning,
We have an ongoing problem with one content database (there are several others that are unaffected): the transaction log is enormous and the bandaid is to add space to the backup drive (yes, I can hear you choking on your coffee, please let that slide for now).

Here's what I know:
IT Site Collection > Projects  sub site > Site Contents
All subsites of the Projects site are showing as  "Modified 3 minutes ago".  
Depending on when I look, this value may be 1 or 2 or 3, up to 6 minutes ago.

IT Site Collection > Projects  sub site > Site Contents > Child site > Site Contents
Documents doc lib shows Modified 3 minute ago
No other lists or apps are listed as modified sooner than 2 weeks ago.

IT Site Collection > Projects  sub site > Site Contents > Child site > Site Contents > Documents doc lib (not settings)
Nothing was modified sooner ago than 43 minutes

IT Site Collection > Site Contents
AllITDocuments doc lib was modified 2 minutes ago.
The most recently edited file was about an hour ago.
IT Site Collection > Site Contents > a custom list (with 4 items in it) shows as modified 3 minutes ago.
The most recent change to this list was in 2013.

Our DBA ran an audit that showed that the doc lib in italics above is showing up in an Event Cache list as having 15,189 recent actions performed on it, compared with from 1 to 5000 or so on other lists.  

Something is making changes to the database every minute or so, but we can't query the db to find out because we don't know what to look for.

Has anyone seen anything like this and how do I fix it?  Our bandaid is obviously unsustainable and we need this mystery solved.

Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Brian CroweDatabase AdministratorCommented:
Can you provide some more objective descriptions of your transaction log and database sizes than "enormous"?  I assume the recovery model on this database is FULL but how often are you performing a transaction log backup?

Below is a query that will help you find frequently run procedures which may or may not help locate the processes that are updating your tables depending on what else is going on in your system.

        ,query_text = SUBSTRING(
                    (CASE WHEN qs.statement_end_offset = -1 
                        THEN len(CONVERT(nvarchar(MAX), qt.text)) * 2 
                        ELSE qs.statement_end_offset 
                        END - qs.statement_start_offset)/2) 
        ,dbname = db_name(qt.dbid)
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
ORDER BY qs.execution_count DESC

Open in new window

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
nsyyoungAuthor Commented:
From my DBA's email:
There is something happening on the GFIT site that is making a lot of database changes, and I don’t think it’s user activity.  This morning we had a 1.8 GB log backup.  The log backups stop on Saturday night and resume Monday morning.  This represents a whole day’s activity.  To put it in perspective, the full backup is only 6.1 GB.  

The only database that has similar activity is the crawl store database after a full crawl.  GF Portal/Finance is 28GB and only changes 100 – 200 MB.

So, what’s running in the GFIT Site that is changing so much data?

Does that help?  (I don't speak SQL very well.)
nsyyoungAuthor Commented:
Yes, full recovery, TLog backup every hour.  I'll get back to you on the query.
Active Protection takes the fight to cryptojacking

While there were several headline-grabbing ransomware attacks during in 2017, another big threat started appearing at the same time that didn’t get the same coverage – illicit cryptomining.

nsyyoungAuthor Commented:
The object_name function gets us a little more information.  I’m not sure what the time span for the execution count is, so it’s difficult to put that into context.   But here’s what we see:  The most executed procedures that would effect a change in the database are:


The names of all of these have to do with site quota.  Is something set up for quotas in GFIT that isn’t in other sites?

There is also a procedure named “proc_UpdateListFields” that appears to update the EventCache table we looked at yesterday.

There are no special quotas set on this site collection or any other company site collections.   There is a 100MB quota in place on personal sites only.
Brian CroweDatabase AdministratorCommented:
A 1.8GB log file in itself isn't alarming.  I have several databases in our production OLTP environment that are significantly larger than that with a 30 minute increment on log backups.  I imagine what you are responding to is a marked increase in database activity.

Do you have any kind of monitoring software (Solarwinds, RedGate, Apex, etc.) that you can utilize?  Were there any other system changes that might correspond to when you saw things change?
nsyyoungAuthor Commented:
That would be 1.8GB of changes when no one is working (i.e., Sunday), so yes, it is alarming that such a marked increase in db activity would be occurring.

We have LogStash, Solar Winds, yes.  LS crashed yesterday, but we're waiting for a report from that.

We do have more information.  There is a custom site template in use on MOST project subsites.  On the sites that don't use the custom template, there is no minute-by-minute updating.  If we create a new doc library and delete the existing one, then the problem goes away, or at least it's not updating every minute.  I have yet to ask the DBA if it matters there.  

If that does matter according to the DBA, then ideally we'd like to fix the site template in use.  Is that even possible?  Barring that, can we change the site template ID to one that isn't broken?

Questions abound.
Vitor MontalvãoMSSQL Senior EngineerCommented:
Check for maintenance jobs as data purge or reindexes. These kind of jobs can origination big transaction logs.
nsyyoungAuthor Commented:
Brian, it was your answer that got us on the right track to solving it, and Vitor, you were very close.  Thanks.
Vitor MontalvãoMSSQL Senior EngineerCommented:
So, what was the cause of the issue?
nsyyoungAuthor Commented:
Uhhhh, I wrote out a whole explanation, which I apparently obliterated with my later comment.

Here's a better explanation from the creator of the mischief:
Back when we created the GFIT site, we had decided that more metadata was needed to keep the document library from becoming the disorganized cluster that is my kitchen junk drawer and the shared drives.  So we created these metadata fields.  Then we populated these fields with lists of possible choices to keep people from typing in similar but uselessly different metadata like “SharePoint” v. “Share Point”, or “SQL vs SQL Server 2008 R2”.  These standard metadata values were put into lists and the choice box columns were populated with a lookup from those lists.  To update the choice box columns when someone added a value to the list, I wrote a powershell script that would sync the column choices with the list.  This script ran every 5 minutes.  

This update process was not noticeable when there were only 3 or 4 document libraries, but now that there are 60 + the effect was significantly bigger.  The database was making 50+ MB of changes every hour.  

I stopped the scheduled task from running the power shell script.   That stopped the unnecessary activity.

We're going to change it to a wf that fires on New or Changed Item and starts that task to run once, and add this to our Lessons Learned list.
Vitor MontalvãoMSSQL Senior EngineerCommented:
A powershell script triggered by a scheduled task. Hard to remember and harder to track if the people that built the process aren't there anymore.

Thanks for the explanation.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SharePoint

From novice to tech pro — start learning today.