SQL Server DB Size

We have a VB.NET tool that opens up 70+ workbooks and stores the contents in a SQL Server database every month.

To give y'all an idea, we're talking about 400 Excel columns for each Excel row (right now there are 2000 rows) for each month.  Most of these columns are numeric (about 30 are strings under 255 characters).

Our problem is that the size of the database grows very rapidly.  With about 60 months stored we're at 67,937,984 KB (roughly 65 GB).  My questions:

1. Given the amount of data we are storing, does this seem like reasonable growth?  

2. Is there something I'm doing wrong with the config/setup of the database?  I periodically shrink the database, which helps a little, but the significant growth each month is still there.

3. Should I consider splitting the data into multiple physical files?  What would the benefits be?

I know... loaded questions, and I am just starting to try to tackle this. Any insight would be appreciated.

rescapacctgitAsked:
 
Anthony PerkinsCommented:
You have a choice:
1.  Continue to import the data from Excel as you are currently doing and be prepared to have a lot of disk space available or
2.  Spend some time to analyze and see how you can normalize the data.  This should give you better performance and certainly less disk space.

Whatever you decide, you need to take a serious look at the data types you are using.  You have basically chosen the lowest common denominator, i.e. the worst-performing and/or biggest data types.

float: Should never be used unless you genuinely need approximate, exponent-scaled precision.  It is an approximate numeric data type, so instead of 1.15 you may end up with 1.149999999999.  I suspect that is not what you want.  Instead use an exact numeric type (tinyint, smallint, int, bigint, numeric) and you will have the bonus of a smaller data type.
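
Not the commenter's code, just a quick illustration of the same approximate-numeric behavior: Python floats are IEEE 754 doubles, like SQL Server's float, so the rounding drift is easy to demonstrate, along with the exact-numeric alternative:

```python
from decimal import Decimal

# IEEE 754 binary floats cannot represent most decimal fractions exactly
stored = 0.1 + 0.2
print(stored == 0.3)      # False: the sum is actually 0.30000000000000004

# The exact binary value hiding behind the literal 1.15
print(Decimal(1.15))      # a long string of digits just under 1.15

# An exact numeric type (like SQL Server's numeric/decimal) has no drift
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```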

datetime:  Do you need second precision?  If not, use smalldatetime.

int:  Make sure it is not overkill.  For example, does "monthdate" really have to be int?  Do you expect to need a range of 2 billion?  The same goes for "StartRunFromMonth" and I am sure there are many others.

varchar(max):  While useful, it has its drawbacks and do you really need to support a FolderPath of 2 billion characters?

What is the point of naming your columns [2] - [52]?  Surely something like Week2 - Week52 would be more appropriate.  The same goes for [1] - [12].  It will also save you having to wrap them in [] every time.
Or even better, instead of using [2] - [52], how about unpivoting the table so those columns become rows?
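
A sketch (in Python, with a hypothetical column layout) of what unpivoting those [2] - [52] week columns into rows looks like; in T-SQL the same reshaping is done with the UNPIVOT operator:

```python
# One wide row: an ID plus 51 week columns named "2".."52" (hypothetical)
wide_row = {"AccountID": 17, **{str(w): float(w) for w in range(2, 53)}}

# Unpivot: each (week, value) pair becomes its own narrow row
narrow_rows = [
    {"AccountID": wide_row["AccountID"], "Week": int(col), "Value": val}
    for col, val in wide_row.items()
    if col != "AccountID"
]

print(len(narrow_rows))   # 51 narrow rows instead of 1 wide row
print(narrow_rows[0])     # {'AccountID': 17, 'Week': 2, 'Value': 2.0}
```

The narrow shape also means NULL weeks simply have no row, instead of occupying a column slot in every record.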

I could go on, but you get the drift.  The bottom line is that I would recommend you hire a SQL Server consultant to fix that mess or continue as is.  Your choice.
 
Scott PletcherSenior DBACommented:
Hmm, let's see.

~400 columns * ~2000 rows * ~60 months = ~48,000,000 values

So each value is taking over 1,400 bytes to store?

That does seem very excessive, if that is the actual number of values you have.
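
The arithmetic above is easy to check directly (a rough estimate only: the database size also includes indexes, the log, and free space, not just row data):

```python
db_size_kb = 67_937_984          # reported database size
values = 400 * 2_000 * 60        # columns * rows * months

bytes_per_value = db_size_kb * 1024 / values
print(values)                    # 48000000 values in total
print(round(bytes_per_value))    # 1449: well over a kilobyte per value
```

For comparison, a float column costs 8 bytes per value, so something on the order of 99% of that space is overhead of some kind.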

Are you storing the data as SQL columns or as some type of varbinary?

 
LuisGuerreroCommented:
Could you post the table definition?  It could help us understand why your DB is so big.

 
Anthony PerkinsCommented:
The problem, I suspect, is that you are importing the data from Excel as-is, without normalizing it or even fixing the data types; hence the rapid growth.
 
rescapacctgitAuthor Commented:
Sorry, it took me a while to figure out exactly why we were doing things a certain way.

Table Definition is pretty basic, as acperkins guessed - not normalized at all.  This is because we are literally slapping the data from Excel into SQL Server as it is in Excel.

Our problem is that we need to store a LOT of data quickly and then run a few stored procedures and spit out a result set.

If we iterate through each row (or even each column) in the Excel workbook, VB.NET takes too long.  If we store everything in VB.NET arrays and use VB.NET to run the calculations, VB.NET runs out of memory (we have 2 GB of RAM).  So it was decided to slap everything into SQL and have SQL run our calculations.

Below is how we are storing an Excel range into SQL Server - I'm hoping that perhaps I can improve how we get the data into SQL without compromising performance.

    Sub InsertData(ByRef oDataToSlap As ExcelData, ByVal sSQLTableName As String, ByVal sServerName As String, ByVal sDatabaseName As String, ByVal sUserID As String, ByVal sPassword As String)


        'SQL Server Connection String
        Dim sSQLConnectionString As String
        sSQLConnectionString = "Data Source=" & sServerName & ";Initial Catalog=" & sDatabaseName & ";User Id=" & sUserID & ";Password=" & sPassword & ";"

        'Bulk Copy to SQL Server
        Dim bulkCopy As SqlBulkCopy
        Try
            bulkCopy = New SqlBulkCopy(sSQLConnectionString)
            bulkCopy.DestinationTableName = sSQLTableName
        Catch ex As Exception
            Dim sExceptionMessage = ex.Message
            Dim exInner As Exception = ex.InnerException
            Dim sErrorStringInner As String = ""
            If Not (exInner Is Nothing) Then
                sErrorStringInner = exInner.Message
            End If
            g_sErrorMessage = "Error occurred in InsertData function while creating the SqlBulkCopy object to insert Excel data into SQL database. MSError: " & sExceptionMessage & ". " & sErrorStringInner
            g_bErroredOut = True
            oDataToSlap.connXL.Close()
            Exit Sub
        End Try

        'Slap into SQL database, then close the bulk copy and Excel connections
        Try
            bulkCopy.WriteToServer(oDataToSlap.drXL)
        Catch ex As Exception
            Dim sExceptionMessage = ex.Message
            Dim exInner As Exception = ex.InnerException
            Dim sErrorStringInner As String = ""
            If Not (exInner Is Nothing) Then
                sErrorStringInner = exInner.Message
            End If
            g_sErrorMessage = "Error occurred in InsertData function while writing to database via SqlBulkCopy. MSError: " & sExceptionMessage & ". " & sErrorStringInner
            g_bErroredOut = True
        Finally
            'Close even when the write fails, so connections are never leaked
            bulkCopy.Close()
            oDataToSlap.connXL.Close()
        End Try

    End Sub
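
Not part of the author's tool; a hedged sketch (in Python, with a made-up `narrow` helper) of the kind of type narrowing acperkins suggests, applied to each cell before bulk-loading: values that are really small whole numbers should not be stored as 8-byte floats.

```python
def narrow(value):
    """Pick the cheapest exact representation for one numeric cell.

    Illustration only: real code would map these names onto the SQL
    Server column types (tinyint/smallint/int) of the destination table.
    """
    # Excel hands back whole numbers as floats (e.g. 12.0); recover the int
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    if isinstance(value, int):
        if 0 <= value <= 255:
            return value, "tinyint"    # 1 byte instead of 8
        if -32768 <= value <= 32767:
            return value, "smallint"   # 2 bytes
        return value, "int"            # 4 bytes
    return value, "float"              # leave true fractions alone

print(narrow(12.0))    # (12, 'tinyint')
print(narrow(-1000))   # (-1000, 'smallint')
print(narrow(1.15))    # (1.15, 'float')
```

In the actual pipeline this decision is made once, when choosing the destination table's column types, rather than per value at load time.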
 
rescapacctgitAuthor Commented:
Attached is the table Definition.

It's about 50 tables.

All the tables that begin with Input are about 1600 rows per run id (not too bad)

The huge tables are the ones that begin with MAIN (1600 rows per month per run ID).
Even worse are SPPSlap, SPIASlap, and SPOSlap; these are 1600 rows per month per SPID (we have 90 SPIDs) per run.

Thanks for looking!

tblDef.txt
 
rescapacctgitAuthor Commented:
Thanks, acperkins... not what I was hoping for, but it confirms some of my suspicions.
 
Scott PletcherSenior DBACommented:
If you have BIDS, it's fairly easy to design and create a package that can adjust the data types as you need before INSERTing into the db.
 
rescapacctgitAuthor Commented:
BIDS?
 
Scott PletcherSenior DBACommented:
Sorry, Business Intelligence Development Studio.

It's the GUI for SSIS package development.