Avatar of websss
websss
Flag for Kenya asked on

Do I need to match RAM size with DB size? and how to only cache based on date?

Hi

I have a 25GB database
We have web app which savings vehicle tracking data
We are on boarding lots more customers, and with this, the database sets to grow into 100's of gig's

Thats really expensive when it comes to RAM with the data centre

The main reason for the growth is we need to keep the last 12 months data, so this really makes the DB grow in size
The reality is, most people only care about the last 1 month, but we do need to keep the last 12 months for accident data etc

So my question is in relation to RAM and DB size and just caching the last months worth of data from 1 massive table
I've looked into SQL server 2016 stretch DB, but we aren't there yet, so its not an option
Someone mentioned partitioning on dates, but not really sure what that means
Others also mentioned SQL is clever and does this for you, but we do notice a slow down when the memory on the server is on the roof.
And we are seeing growth rate of 200MB per day

we've just run out of ram on the server again
Do i just need to endlessly throw RAM and an ever growing database, or is there a more efficient way to only cache the latest data in memory?
we are using sql server 2016
Microsoft SQL Server

Avatar of undefined
Last Comment
Vitor Montalvão

8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
Vitor Montalvão

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
Mal Osborne

Not sure if you have "run out of RAM" per se. If SQL reads some data from disk, it will keep it cached in memory for ever, provided RAM is available. If a heap of data was read in March, it will still be in RAM in April, unless SQL is restarted, or another application needs RAM. Thus, if you have a 25Gb database, once every record has been read at least once, it will try to keep the whole lot cached.

This makes sense; hard drive is MUCH slower than RAM, so if RAM is not needed for anything else, SQL might as well use it. You need to make a measure of performance, and decide on a "sweet spot". If most of the data read is in the last month, then enough RAM to cache a month of data should be adequate. The rest is not really going to do anything useful.
websss

ASKER
Thanks Guys

Actually the "Partitioned Table and Index Concepts" is very helpful

In my case, the users are usually just interested in the 30 days of data, so does that mean if sql is running for 3 months, its going to pointlessly be caching month 3, and using valuable RAM when it should all be allocated to the last 30 days of data?
Vitor Montalvão

In my case, the users are usually just interested in the 30 days of data, so does that mean if sql is running for 3 months, its going to pointlessly be caching month 3, and using valuable RAM when it should all be allocated to the last 30 days of data?
Correct.
You can create a "live" partition to store the last 30 days, then, for example, a "quarter" partition to store the last 90 days, a "year" partition to store the last 12 months and an "archive" partition to store data older than one year.
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
websss

ASKER
also, with the above in mind, should i be restarting SQL to clear its cache, and then run something that caches the last 30 days data?

Open to suggestions on how to do this properly!!
websss

ASKER
Hi Vitor

Thank you for this info, i've been reading a lot, and seems quite an involved subject

So the Live, quarter and year partition are all Table Partitioning?
We constantly new data coming in, about 10 records a second, but this could scale to 100 a second.

Users will also want to see the latest record for where their ASSET is
Do you foresee this as being and issue?
Mal Osborne

No need to restart SQL. It is usually pretty good at figuring out what data to keep in memory. If, most of the data accessed is recent. then that data will almost always be cached when needed. Unless you set a cap, or some other process wants RAM, SQL just keeps everything read from disk in memory. A slowdown when RAM is nearly all in use could be due to other processes having less to play with. Maybe peg use at half of total RAM and see how that goes?
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Vitor Montalvão

So the Live, quarter and year partition are all Table Partitioning?
Yes. That's my suggestion based on what you've been told us in this question so do not take it as granted but you can start working from this suggestion and tweak it as you need.

Users will also want to see the latest record for where their ASSET is
 Do you foresee this as being and issue?
No issues at all. The data it will be there in one partition or other and this is very transparent for the user. What partitioning does is to separate physically the records so the database engine will deal only with the information that matters improving the performance.