System Out of Memory Error

I have been running the following code for some time, but I just ran into a System.OutOfMemoryException when working with very large files. This data set has 9,707,227 records, separated into 12 strata. Each stratum is held in an ArrayList and then serialized to its own file. The error occurs on the second, third, or fourth stratum. Is there a way to delete files to avoid this problem?
Try
    Dim saveFile As FileStream
    saveFile = File.OpenWrite(FName)
    saveFile.Seek(0, SeekOrigin.End)
    Bformatter.Serialize(saveFile, ArrayListArray(I))
    saveFile.Close()
    gMsg += "STRAT" & (I + 1).ToString & " " & FileDateTime(FName) & vbCrLf
    If I = 0 Then
        AuditInfo.FirstBinaryDateTimeStamp = FileDateTime(FName)
    ElseIf I = L - 1 Then
        AuditInfo.LastBinaryDateTimeStamp = FileDateTime(FName)
    End If
Catch ex As Exception
    MsgBox("The following error occurred building the binary files: " & vbCrLf & ex.Message, MsgBoxStyle.OkOnly, "Binary File Build Error")
    Exit Sub
End Try
Next I


rkulpAsked:

blandyukCommented:
There is missing code in the above: where is the start of the "For...Next" loop? Also, what is "L"? The reason you're running out of memory is that one of your variables holds too much data. You need to debug and check their sizes.
rkulpAuthor Commented:
blandyuk,

Thanks for pointing out my omission. L contains the number of strata; in this case, L = 12. The ArrayLists can be very large, as together they hold all the data in 9.7 million records, each about 60 characters long. This cannot be forecast and cannot be changed. What I need is a strategy for removing files and controlling memory use. Each element of ArrayListArray is an ArrayList. Unfortunately, I cannot remove elements, as the ArrayList.RemoveAt method does not apply to ArrayListArray itself (it is an array, not an ArrayList). I was hoping to work backwards: remove the last ArrayList, resize the array, and clean up somehow.
            gMsg = "Binary files timestamps: " & vbCrLf
            For I = 0 to L-1
                If gDataPath.EndsWith("\") Then
                    FName = gDataPath & "STRAT" & (I + 1).ToString & ".bin"
                Else
                    FName = gDataPath & "\STRAT" & (I + 1).ToString & ".bin"
                End If
                If My.Computer.FileSystem.FileExists(FName) Then Kill(FName)
                Try
                    Dim saveFile As FileStream
                    saveFile = File.OpenWrite(FName)
                    saveFile.Seek(0, SeekOrigin.End)
                    Bformatter.Serialize(saveFile, ArrayListArray(I))
                    saveFile.Close()
                    gMsg += "STRAT" & (I + 1).ToString & " " & FileDateTime(FName) & vbCrLf
                    If I = 0 Then
                        AuditInfo.FirstBinaryDateTimeStamp = FileDateTime(FName)
                    ElseIf I = L - 1 Then
                        AuditInfo.LastBinaryDateTimeStamp = FileDateTime(FName)
                    End If
                Catch ex As Exception
                    MsgBox("The following error occurred building the binary file for stratum " + (I + 1).ToString + ": " & vbCrLf & ex.Message)
                    Exit Sub
                End Try
            Next I
            MsgBox(gMsg)


blandyukCommented:
Once "ArrayListArray(I)" has been used, you can just set it to nothing:
ArrayListArray(I) = Nothing
This releases the reference so the garbage collector can reclaim the memory.

If those arrays together hold 9.7 million records of about 60 characters, that is roughly 580 MB of raw character data, and considerably more in memory since .NET stores strings as UTF-16. Keeping several strata loaded at once will trigger the Out of Memory error, and you have 12, which is too much. You need to think of another way of dealing with your data so it doesn't use too much RAM. Do you really need all the data in memory before writing it? Can you deal with one stratum at a time? Have you thought about using compression if your data is plain text?
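A minimal sketch of that idea, using the variable names from the code posted above (the surrounding loop is assumed from the earlier listing):

```vbnet
' After stratum I has been serialized, drop the reference so the
' garbage collector can reclaim the ArrayList before the next pass.
Bformatter.Serialize(saveFile, ArrayListArray(I))
saveFile.Close()

ArrayListArray(I).Clear()     ' release the record references
ArrayListArray(I) = Nothing   ' release the ArrayList itself
```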
blandyukCommented:
We really need to know what the data in these arrays is actually used for. What gets changed? Is anything compared across the arrays? Examples would be useful.
rkulpAuthor Commented:
The data are invoices, stratified by invoice amount. The total population for this audit is 9.7 million records, spread over 12 strata. The values are sorted and then written to disk as a binary serialization. The structure that is serialized contains the original data plus two small fields, record number and stratum number (data = 45 characters, record number = 10 characters, stratum number = 2 characters, for a total of 57). The strata are then randomly sampled. I will try setting ArrayListArray(I) to Nothing to see what happens.
blandyukCommented:
Maybe you can split the files into 1-million-record parts? Build a cache as you read; once it hits 1 million records, flush the cache to file, clear it, and carry on reading until EOF. That way the cache never grows too large, regardless of the total amount of data. I actually do this for one of my projects :) and it has over 7 billion records, and there are 3 such projects.
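A hypothetical sketch of that chunked-flush pattern (names like inputPath and WriteChunk are placeholders, not from this thread):

```vbnet
' Flush to disk every 1,000,000 records so the in-memory cache stays bounded.
Const ChunkSize As Integer = 1000000
Dim cache As New ArrayList()
Using reader As New IO.StreamReader(inputPath)
    Dim line As String = reader.ReadLine()
    While line IsNot Nothing
        cache.Add(line)
        If cache.Count = ChunkSize Then
            WriteChunk(cache)   ' serialize/append this chunk to its part file
            cache.Clear()       ' release the records just written
        End If
        line = reader.ReadLine()
    End While
End Using
If cache.Count > 0 Then WriteChunk(cache)   ' flush the final partial chunk
```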
rkulpAuthor Commented:
I can't really split it up that way, since I must fill all L strata simultaneously from data that is not sorted by invoice amount.
rkulpAuthor Commented:
I had to force garbage collection to clear the memory. Great solution and easy to implement.
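For reference, a sketch of what that final fix could look like at the end of each loop iteration (forcing collection with GC.Collect is normally a last resort, but it is what resolved this case):

```vbnet
' Release the stratum just written, then force a collection so the
' memory is reclaimed before the next large ArrayList is built.
ArrayListArray(I) = Nothing
GC.Collect()
GC.WaitForPendingFinalizers()
```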