jscharpf

asked on

Handling VERY large arrays and using LOTS of memory!

Well,
I am trying to deal with huge multidimensional arrays (several arrays of (8, 8000000), yes, that's 8 by 8 million, lol), and these are single-precision numbers. I'm declaring these all up front in the declarations section of my module (they aren't dynamic).

Up to about 8 million I have no problems other than slowing my system down, and I'm OK with that.
At some point, if I keep increasing the array size, I eventually get an "out of memory" error when attempting to run VB. I understand this (I have about 1 GB of RAM).
My question is: is there a better way to handle this much data? Would dynamic arrays help? (I still need this much data regardless.)
Why doesn't Windows automatically swap data out when it exceeds RAM? Why am I getting out-of-memory errors at all? Should I increase my swap file size?
I'm OK with the large amounts of data slowing my system down. I'm just wondering what the best way to handle these very large arrays in VB is.

I'm using VB 6.0 on Windows XP Pro. I think it's 1 GB of RAM; not sure of the processor, but it's brand new.


Thanks

Jeff
inthedark

XP is very slow. Have you put all of the Windows Updates on it yet?
jscharpf

ASKER

Yes.
I'm not worried about how slow it is. I'm worried about how far I can push the memory limit. Isn't the point of a swap file to dump excess data when memory is low?
I just wanted to add:
Is there some limit on array size in VB? I am currently working with 8 by 6 million without problems, but when I go higher I get out-of-memory errors. Will extra RAM help (and if so, why doesn't Windows just swap it out?), or is there a limit in VB?

How are you using these arrays, randomly or sequentially?

I suspect that each array would need about 244 MB of data, so you need more RAM, or......

You can get 2 GB USB pen drives for not much money now. You could create a virtual array class where the array elements are paged to/from disk. You could read all elements sequentially in just a few seconds. I suspect that USB 2 pen drives would be very fast. You also have the option of using memory sticks.
Hi,

Each of your 8 x 8,000,000 arrays takes about 250 MB of memory. I don't know what limits VB imposes, but Windows imposes a limit of 2 GB of addressable memory per application. You imply that you have more than one of these arrays. If you have more than 8, you will max out the addressable memory space, in which case, while there are workarounds, they are incredibly difficult to implement, and I doubt that you can do it in VB. If the 2 GB limit is not what you are hitting, maybe VB imposes a more restrictive limit. There may also be a limit on the size of an array.

Zaphod.
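The arithmetic behind those figures is easy to sanity-check. A quick sketch (Python used purely as a calculator here): the ~244-250 MB estimates above come from 8 x 8,000,000 elements at 4 bytes each; counting VB6's default lower bound of 0, each dimension actually has one extra element, so the true footprint is slightly larger.

```python
# Estimate the memory footprint of a VB6 array declared as
#   Dim A(8, 8000000) As Single
BYTES_PER_SINGLE = 4  # a VB6 Single is a 4-byte IEEE 754 float

def array_bytes(upper_bounds):
    """Bytes used by a VB6 Single array with the given upper bounds.
    VB6's default lower bound is 0, so each dimension holds ub + 1 elements."""
    total = BYTES_PER_SINGLE
    for ub in upper_bounds:
        total *= ub + 1
    return total

size = array_bytes([8, 8_000_000])
print(f"{size:,} bytes ~= {size / 2**20:.0f} MB per array")
```

At roughly 275 MB per array, seven or eight such arrays already exhaust the 2 GB per-process address space mentioned above, before the program's own code and other data are counted.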
OK, so it just sounds like I need more RAM. I wouldn't know how to do any paging from disk; that might be too advanced.
If you use Task Manager's Performance tab, you can start with your array size set to a lower level and increase it in stages. When available memory hits zero, Windows starts to swap and slows down big-time.

If you keep your arrays in a disk file instead, your program and Windows will not waste time dumping and retrieving the same data over and over again.

You can even make your code look almost the same:

Dim ArrayName(x, y)

result = ArrayName(x, y)

Becomes:

Function ArrayName(x As Long, y As Long) As Single
' "to do": read the element at (x, y) from the disk file into res
ArrayName = res
End Function
ASKER CERTIFIED SOLUTION
inthedark
jscharpf,

Does the array really need to be that large? Is there data in EVERY cell of the array? Or are many of the cells in the array empty?

If the array is largely empty (i.e. most cells are the same or empty), you could do something as simple as using a Collection, with the Key as CStr(RowNumber) + "|" + CStr(ColumnNumber).

Seeya
Matthew
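The sparse-storage idea above can be sketched language-neutrally. In this illustration a Python dict plays the role of the VB6 Collection keyed by row and column; the SparseArray name and the default value are illustrative choices, not anything from the thread:

```python
# Sparse 2-D array: store only non-empty cells, keyed by (row, col).
# Mirrors the suggested VB6 Collection with key
# CStr(RowNumber) + "|" + CStr(ColumnNumber).
class SparseArray:
    def __init__(self, default=0.0):
        self.cells = {}          # only populated cells consume memory
        self.default = default   # value returned for never-written cells

    def put(self, row, col, value):
        self.cells[(row, col)] = value

    def get(self, row, col):
        return self.cells.get((row, col), self.default)

a = SparseArray()
a.put(3, 1_500_000, 42.5)
print(a.get(3, 1_500_000))  # 42.5
print(a.get(0, 0))          # 0.0 (never stored)
print(len(a.cells))         # 1 cell actually held in memory
```

The trade-off: each stored cell costs more than 4 bytes of overhead, so this only wins when the array is genuinely mostly empty.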
Furthermore, if you want to go that way, I would suggest that you turn the virtual array into a class.

It was able to read/write 1,000,000 elements in less than 12 seconds.

Here is a test in a Form_Load:

Option Explicit

Dim A As New zVirtualArray

Private Sub Form_Load()

' setup the virtual array

A.FileName = "C:\test1.txt"
A.YElements = 999
A.XElements = 999

Dim X As Long
Dim Y As Long
Dim lc As Single
Dim sngStart As Single

sngStart = Timer

' place some stuff into an array
lc = 0
For X = 0 To A.XElements
    For Y = 0 To A.YElements
        A(X, Y) = lc
        lc = lc + 1
    Next Y
Next X

' check that it is still there
lc = 0
For X = 0 To A.XElements
    For Y = 0 To A.YElements
        If lc <> A(X, Y) Then
            MsgBox "Bug found"
        End If
        lc = lc + 1
    Next Y
Next X

MsgBox "All done in " + Format(Timer - sngStart, "0.000") + " " + Format(lc, "0,000") + " records checked"

End Sub





'-----------------------------zVirtualArray.cls
Option Explicit

' Warning:

' for this class to work you must do this:
' Tools, Procedure Attributes, select Value, Advanced,
' change the "Procedure ID" to "Default", then Apply



Public FileName As String

Public YElements As Long
Public XElements As Long

Dim mbOpen As Boolean
Dim mlLFN As Long
Dim mlSize As Long

Private Sub Class_Initialize()
Dim sngDummy As Single
sngDummy = 1
mlSize = Len(sngDummy)
End Sub

Private Sub Class_Terminate()
If mbOpen Then
    Close mlLFN
End If
End Sub

Public Property Get Value(X As Long, Y As Long) As Single
Dim Result As Single
If Not mbOpen Then
    zOpenFile
End If
Get mlLFN, zPos(X, Y), Result
Value = Result
End Property

Public Property Let Value(X As Long, Y As Long, NewValue As Single)
If Not mbOpen Then
    zOpenFile
End If
Put mlLFN, zPos(X, Y), NewValue
End Property

Private Sub zOpenFile()
mbOpen = True
mlLFN = FreeFile
Open FileName For Binary Access Read Write Shared As #mlLFN
End Sub

Private Function zPos(X As Long, Y As Long) As Long
zPos = (X * (YElements + 1) + Y) * mlSize + 1
End Function



You could improve the speed further if we knew that you access the array sequentially.
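As a side note, the record-offset arithmetic in zPos can be checked on its own. Here is a sketch of the same formula in Python (the constants mirror the Form_Load test above; VB6 Get/Put positions are 1-based, hence the + 1):

```python
# Check the record-offset formula used by zVirtualArray.zPos:
#   pos = (X * (YElements + 1) + Y) * size + 1
SIZE = 4            # Len of a VB6 Single
Y_ELEMENTS = 999    # matches A.YElements in the form-load test

def z_pos(x, y):
    return (x * (Y_ELEMENTS + 1) + y) * SIZE + 1

# Every (x, y) pair must map to its own non-overlapping 4-byte slot.
offsets = {z_pos(x, y) for x in range(3) for y in range(Y_ELEMENTS + 1)}
assert len(offsets) == 3 * (Y_ELEMENTS + 1)   # no collisions
assert min(offsets) == 1                      # first record at byte 1
assert (max(offsets) - 1) % SIZE == 0         # records are aligned
print("offset formula checks out")
```

Because consecutive Y values land in consecutive 4-byte slots, scanning the inner Y loop reads the file sequentially, which is why the class benchmarks well.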
Thanks for all the help so far. I will have to read through this all thoroughly. It seems rather complicated, lol.
The array is initially empty, but I wanted to reserve the space so I have the capability of filling it.
When I run the program, I load data from a text file (this is how the data is presented, and I can't change that). Sometimes the data doesn't even come close to filling the array; other times it overfills the array (I get "Subscript out of range" or something like that). In the past I simply increased my array size, but I'm trying to find out if there is some point where I simply can't do this anymore.
I was worried that by using a dynamic array I may end up crashing; not sure.
So I've been increasing the array size (in the declarations) little by little, because these text files have been growing.

There are many arrays. Here's just a small section of my declarations:

______________________________________________
Public cycleData(8) As Long  'number of datapoints
Public ChargeTime(8, 5000) As Single  'retrieved charge time for each station, cycle
Public ChargeTemp(8, 5000) As Single
Public DischargeTime(8, 5000) As Single
Public DischargeTemp(8, 5000) As Single
Public DischargeCapacity(8, 5000) As Single
'word library declarations
Public NewDoc As Word.Application
Public NextDoc As Word.Application
Public Discharge25TimeData(8, 300000) As Single 'arrays to store the 25th cycle data
Public Charge25TimeData(8, 300000) As Single
Public Discharge25CycleData(8, 300000) As Single
Public Charge25CycleData(8, 300000) As Single
Public Discharge25VoltageData(8, 300000) As Single
Public Charge25VoltageData(8, 300000) As Single
Public Discharge25TempData(8, 300000) As Single
Public Charge25TempData(8, 300000) As Single
Public Discharge25CurrentData(8, 300000) As Single
Public Charge25CurrentData(8, 300000) As Single
Public Charge25Count(8) As Long  'stores the total number of records for each station 25x
Public Discharge25Count(8) As Long
Public Finalcycle(8) As Integer  'store the nth number for reference (25, 50, etc.)
Public Total25Cycles(8) As Integer 'store the total number of 25X cycles recorded..
Public stri(19) As String
Public countcell1 As Single
Public countcell2 As Single
Public countcell3 As Single
Public countcell4 As Single
Public countcell5 As Single
Public celldata(7, 6000000) As Single
Public tempdata(7, 6000000) As Single
Public countdata() As Single
Public passedcellData() As Single
Public timeData(6000000) As Single
Public cycleDataCell(6000000) As Long
Public CycleNum As Long
Public cycleCount As Long
Public cycleStart As Long
Public cycleEnd As Long
Public CapacityAt24(8) As Single
Public CapacityNow(8) As Single
______________________________________

I'm not exactly a proficient programmer. I just keep increasing the size of these arrays :)
The data that is loaded eventually gets graphed. I have a graphing OCX that I bought that is made to handle an unlimited number of points. The data is grouped in sections of about 15,000 points each. But that's not my issue; it's just finding the limits given my amount of RAM (1 GB).

So I will look through the answers and try some suggestions. I appreciate the help!
I will try to get a result soon!

jeff

jscharpf,
    Is it possible for you to read in a few lines, graph the matching points, drop the data, and read in the next few, looping this way to the end of your file? That way you get rid of the huge memory demand.

Dang123
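That read-graph-drop loop can be sketched as follows (Python for illustration; the one-value-per-line, blank-line-between-cycles file format is an assumption for the sketch, not the thread's actual data layout):

```python
# Stream the data file one "cycle" at a time instead of loading it all:
# parse a chunk, hand it to the grapher, drop it, move on.
def cycles(path):
    """Yield one cycle (a list of floats) at a time; memory use stays
    bounded by the largest single cycle, not by the whole file."""
    chunk = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:          # blank line = cycle boundary (assumed format)
                if chunk:
                    yield chunk
                    chunk = []
            else:
                chunk.append(float(line))
    if chunk:                     # last cycle may lack a trailing blank line
        yield chunk

# usage sketch:
# for points in cycles("testdata.txt"):
#     graph(points)   # graph() stands in for the charting OCX call
```

The generator never holds more than one cycle in memory, so the file can grow without ever hitting the out-of-memory error.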

What is the objective of the application?
If speed is not your concern, why not write the data to a file and then pull the data from the file when needed?
Wow, so many suggestions :)
The objective of the application is to view large quantities of test data. There are millions of data points, and the data is viewed in "cycles". The user needs to see all of the data, but there is so much that he/she can't see it all at once, so even though I store it all, I only show portions at any one time.
The only problem with keeping the data in a file is that I'm rather ignorant when it comes to file access; I know sequential and that's it. The data is grouped in "cycles": cycle #1 will have XX data points, and so on. The user selects the cycle to view, or just clicks and moves through cycle by cycle. Each click redraws a graph of thousands of points representing one cycle. There can be thousands of cycles.
Having it in memory was (for me) easier to work with. If I understood file I/O, then certainly I could have the specific data for each "cycle" brought in from this huge text file as needed. I just don't know how long it would take to parse through a 1 GB text file to find a set of points somewhere in the middle.
Currently it takes about 7 minutes to load the large file, but once it's loaded (assuming I don't run out of memory), it goes from cycle to cycle fairly quickly.
I'm sorry for the slow response, but I am tied up in other things today. I want to get back to this program soon :(
Jeff
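Finding a cycle in the middle of a 1 GB text file doesn't actually require parsing the whole file on every lookup: a single initial scan can record the byte offset where each cycle starts, and later lookups seek straight to it. A hedged sketch (Python; the blank-line-separated format is again an assumption):

```python
# Scan once to index cycle start positions, then seek on demand.
def build_index(path):
    """Return a list of byte offsets, one per cycle start."""
    offsets = []
    at_start = True
    with open(path, "rb") as f:       # binary mode: offsets are exact bytes
        while True:
            pos = f.tell()
            line = f.readline()
            if not line:
                break
            if line.strip() == b"":   # blank line ends a cycle (assumed)
                at_start = True
            elif at_start:
                offsets.append(pos)   # first data line of a new cycle
                at_start = False
    return offsets

def read_cycle(path, offsets, n):
    """Read only cycle n (0-based) using the prebuilt index."""
    points = []
    with open(path, "rb") as f:
        f.seek(offsets[n])
        for line in f:
            if line.strip() == b"":
                break
            points.append(float(line.decode()))
    return points
```

The index itself is tiny (a few bytes per cycle), so thousands of cycles cost almost nothing, and each click reads only the thousands of points it is about to graph.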

In QA with large volumes of data, it is sometimes best to store all of the points in a database and use stats to produce trends and, most importantly, exceptions to the trends.