Handling VERY large arrays and using LOTS of memory!

jscharpf asked
Last Modified: 2010-05-01
Well,
I am trying to deal with huge multidimensional arrays: several arrays of (8, 8000000). Yes, that's 8 by 8 million, lol, and these are single-precision numbers.
I'm declaring these all in the beginning in the declarations of my module (these aren't dynamic).

Up to about 8 million I have no problems other than slowing my system down, and I'm ok with that.
At some point, if I keep increasing the array size, I get an "out of memory" error when attempting to run the program in VB. I understand this (I have about 1 GB of RAM).
My question is: is there a better way to handle this much data? If I use dynamic arrays, will this help? (I still need this much data regardless.)
Why doesn't Windows automatically swap out data if it exceeds the RAM? Why am I getting out of memory errors at all? Should I increase my swap file size?
I'm ok with the large amounts of data slowing my system down. I'm just wondering what the best way to handle these very large arrays in VB is..

I'm using VB 6.0, Windows XP Pro. I think it's 1G ram.. not sure of the processor but it's brand new..


Thanks

Jeff

CERTIFIED EXPERT

Commented:
XP is very slow.  Have you installed all of the Windows Updates on it yet?

Author

Commented:
yes
I'm not worried about how slow it is. I'm worried about how much I can push the memory limit. Isn't the point of a swap file to dump excess data when memory is low?

Author

Commented:
I just wanted to add.
Is there some limit on array size for VB? I am currently working with 8 by 6 million.. without problems.. but when I go higher I get out of memory errors.. Will extra RAM help (if so, why doesn't windows just swap it out?).. or is there a limit in VB?

CERTIFIED EXPERT

Commented:
How are you using these arrays, randomly or sequentially?

I suspect that each array would need about 244 MB (8 x 8,000,000 elements x 4 bytes), so you need more RAM or...

You can get 2 GB USB pen drives for not much money now.  You could create a virtual array class where the array elements are paged to/from disk.  You could read all elements sequentially in just a few seconds.  I suspect that USB2 pen drives would be very fast.  You also have the option of using memory sticks.

Hi,

Each of your 8 x 8,000,000 arrays takes about 250 MB of memory. I don't know what limits VB imposes, but Windows imposes a limit of 2 GB of addressable memory per application. You imply that you have more than one of these arrays. If you have more than 8, you will max out the addressable memory space. There are workarounds for that, but they are incredibly difficult to implement, and I doubt that you can do it in VB. If the 2 GB limit is not what you are hitting, maybe VB imposes a more restrictive limit. There may also be a limit on the size of a single array.

Zaphod.
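
The size estimates above can be checked with a few lines of VB6. This is only a sketch: ArrayBytes and ShowArraySize are names made up for illustration, and it assumes the default lower bound of 0 (no Option Base 1), under which Dim A(8, 8000000) actually holds 9 x 8,000,001 elements rather than 8 x 8,000,000.

```vb
Option Explicit

' Bytes needed by a two-dimensional array declared as Dim A(Upper1, Upper2),
' ASSUMING the default lower bound of 0 in both dimensions.
Public Function ArrayBytes(ByVal Upper1 As Long, ByVal Upper2 As Long, _
                           ByVal BytesPerElement As Long) As Double
    ' Convert to Double first so the multiplication cannot overflow a Long.
    ArrayBytes = CDbl(Upper1 + 1) * CDbl(Upper2 + 1) * BytesPerElement
End Function

Public Sub ShowArraySize()
    Dim dBytes As Double
    dBytes = ArrayBytes(8, 8000000, 4)      ' a Single occupies 4 bytes
    Debug.Print Format$(dBytes / 1048576#, "0.0") & " MB"
End Sub
```

With zero-based bounds the figure comes out at roughly 275 MB per array, so fewer than eight such arrays fit inside a 2 GB address space even before the program, its other variables, and any loaded libraries are counted.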

Author

Commented:
ok so it just sounds like I need more RAM.. I wouldn't know how to do any paging from disk.. That might be too advanced..
CERTIFIED EXPERT

Commented:
If you use the Task Manager's Performance tab, you can start with your array size set to a lower level and increase it in stages.  When available memory reaches zero, Windows starts to swap and slows down big-time.

If you save your arrays in a disk file, your program controls what is read and written, so neither it nor Windows wastes time dumping and retrieving the same data over and over again.

You can even make your code look almost the same:

Dim ArrayName(x,y)

result = ArrayName(x,y)

Becomes:

Function ArrayName(x As Long, y As Long) As Single
    Dim res As Single
    ' TODO: read element (x, y) from the disk file into res
    ArrayName = res
End Function

Commented:
jscharpf,

Does the array really need to be that large? Is there data in EVERY cell of the array? Or are many of the cells in the array empty?

If the array is largely empty (i.e. most data is the same or empty), you could do something as simple as use a Collection, with the Key as CStr(RowNumber) & "|" & CStr(ColumnNumber).

Seeya
Matthew
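
A minimal sketch of the Collection idea follows. The names mCells, CellKey, SetCell and GetCell are hypothetical, and the sketch assumes that a cell which was never set should read back as 0. Each Collection entry costs considerably more than the 4 bytes of a Single, so this only pays off when most cells are empty.

```vb
Option Explicit

Private mCells As New Collection

' Builds the "row|column" key suggested above.
Private Function CellKey(ByVal Row As Long, ByVal Col As Long) As String
    CellKey = CStr(Row) & "|" & CStr(Col)
End Function

' Stores a value. A Collection item cannot be updated in place,
' so any existing entry with the same key is removed first.
Public Sub SetCell(ByVal Row As Long, ByVal Col As Long, ByVal Value As Single)
    Dim sKey As String
    sKey = CellKey(Row, Col)
    On Error Resume Next
    mCells.Remove sKey          ' raises an error if the key is absent; ignored
    On Error GoTo 0
    mCells.Add Value, sKey
End Sub

' Returns the stored value, or 0 (ASSUMED default) if the cell was never set.
Public Function GetCell(ByVal Row As Long, ByVal Col As Long) As Single
    On Error Resume Next
    GetCell = mCells.Item(CellKey(Row, Col))
    On Error GoTo 0
End Function
```

Usage would look like SetCell 3, 1500000, 4.2 followed by Debug.Print GetCell(3, 1500000).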
CERTIFIED EXPERT

Commented:
Furthermore, if you want to go that way I would suggest that you turn the virtual array into a class.

The class below was able to read/write 1,000,000 elements in less than 12 seconds.

Here is a test in a Form_Load:

Option Explicit

Dim A As New zVirtualArray

Private Sub Form_Load()
    Dim X As Long
    Dim Y As Long
    Dim lc As Single
    Dim sngStart As Single

    ' Set up the virtual array (indexes 0..999 in each dimension)
    A.FileName = "C:\test1.txt"
    A.YElements = 999
    A.XElements = 999

    sngStart = Timer

    ' Place some values into the array
    lc = 0
    For X = 0 To A.XElements
        For Y = 0 To A.YElements
            A(X, Y) = lc
            lc = lc + 1
        Next Y
    Next X

    ' Check that they are still there
    lc = 0
    For X = 0 To A.XElements
        For Y = 0 To A.YElements
            If lc <> A(X, Y) Then
                MsgBox "Bug found"
            End If
            lc = lc + 1
        Next Y
    Next X

    MsgBox "All done in " & Format(Timer - sngStart, "0.000") & " " & Format(lc, "0,000") & " records checked"
End Sub





'-----------------------------zVirtualArray.cls
Option Explicit

' Warning:

' For this class to work, Value must be made the default member:
' Tools, Procedure Attributes, select Value, Advanced,
' change the "Procedure ID" to "Default", then Apply



Public FileName As String

Public YElements As Long
Public XElements As Long

Dim mbOpen As Boolean
Dim mlLFN As Long
Dim mlSize As Long

Private Sub Class_Initialize()
Dim sngDummy As Single
sngDummy = 1
mlSize = Len(sngDummy)
End Sub

Private Sub Class_Terminate()
If mbOpen Then
    Close mlLFN
End If
End Sub

Public Property Get Value(X As Long, Y As Long) As Single
    Dim Result As Single
    If Not mbOpen Then
        zOpenFile
    End If
    ' Read one Single from the element's byte position
    Get mlLFN, zPos(X, Y), Result
    Value = Result
End Property

Public Property Let Value(X As Long, Y As Long, NewValue As Single)
    If Not mbOpen Then
        zOpenFile
    End If
    ' Write one Single at the element's byte position
    Put mlLFN, zPos(X, Y), NewValue
End Property

Private Sub zOpenFile()
    mlLFN = FreeFile
    Open FileName For Binary Access Read Write Shared As #mlLFN
    ' Only mark the file as open once Open has succeeded
    mbOpen = True
End Sub

Private Function zPos(X As Long, Y As Long) As Long
    ' 1-based byte offset of element (X, Y); elements are stored row by row
    zPos = (X * (YElements + 1) + Y) * mlSize + 1
End Function



CERTIFIED EXPERT

Commented:
The speed could be improved if we knew that you access the array sequentially.
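
One way sequential access might be exploited is to read a whole row with a single Get instead of one Get per element. The sketch below is not part of the class posted above: ReadRow is a hypothetical helper, and it assumes the file uses the same layout as zVirtualArray (4-byte Singles stored row by row). In Binary mode, Get fills an array variable with raw data only, without an array descriptor.

```vb
Option Explicit

' Reads all Y values for one X from a file laid out like the
' zVirtualArray file. The caller passes a dynamic array, e.g.
'   Dim r() As Single
'   ReadRow "C:\test1.txt", 5, 999, r
Public Sub ReadRow(ByVal FileName As String, ByVal X As Long, _
                   ByVal YElements As Long, ByRef Row() As Single)
    Dim fn As Integer
    Dim lPos As Long

    ReDim Row(0 To YElements)
    lPos = (X * (YElements + 1)) * 4 + 1    ' same formula as zPos(X, 0)

    fn = FreeFile
    Open FileName For Binary Access Read As #fn
    Get #fn, lPos, Row                       ' one disk read fills the whole array
    Close #fn
End Sub
```

Opening and closing the file on every call keeps the sketch short; a real version would probably keep the file open between calls, as the class does.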

Author

Commented:
Thanks for all the help so far. I will have to read through this all thoroughly. It seems rather complicated lol..
The array is initially empty, but I wanted to reserve the space so I have the capability of filling it..
When I run the program, I load data from a text file (this is how the data is presented and I can't change that). Sometimes the data doesn't even come close to filling the array; other times it overfills the array (I get "subscript out of range" or something like that). In the past I simply increased my array size, but I'm trying to find out if there is some point where I simply can't do this anymore.
 I was worried that by using a dynamic array,  I may end up crashing.. not sure..
so I've been increasing the array (in the declarations) size little by little.. because these text files have been growing.

There are many arrays. here's just a small section of my declarations..:

______________________________________________
Public cycleData(8) As Long  'number of datapoints
Public ChargeTime(8, 5000) As Single  'retrieved charge time for each station, cycle
Public ChargeTemp(8, 5000) As Single
Public DischargeTime(8, 5000) As Single
Public DischargeTemp(8, 5000) As Single
Public DischargeCapacity(8, 5000) As Single
'word library declarations
Public NewDoc As Word.Application
Public NextDoc As Word.Application
Public Discharge25TimeData(8, 300000) As Single 'arrays to store the 25th cycle data
Public Charge25TimeData(8, 300000) As Single
Public Discharge25CycleData(8, 300000) As Single
Public Charge25CycleData(8, 300000) As Single
Public Discharge25VoltageData(8, 300000) As Single
Public Charge25VoltageData(8, 300000) As Single
Public Discharge25TempData(8, 300000) As Single
Public Charge25TempData(8, 300000) As Single
Public Discharge25CurrentData(8, 300000) As Single
Public Charge25CurrentData(8, 300000) As Single
Public Charge25Count(8) As Long  'stores the total number of records for each station 25x
Public Discharge25Count(8) As Long
Public Finalcycle(8) As Integer  'store the nth number for reference (25, 50, etc.)
Public Total25Cycles(8) As Integer 'store the total number of 25X cycles recorded..
Public stri(19) As String
Public countcell1 As Single
Public countcell2 As Single
Public countcell3 As Single
Public countcell4 As Single
Public countcell5 As Single
Public celldata(7, 6000000) As Single
Public tempdata(7, 6000000) As Single
Public countdata() As Single
Public passedcellData() As Single
Public timeData(6000000) As Single
Public cycleDataCell(6000000) As Long
Public CycleNum As Long
Public cycleCount As Long
Public cycleStart As Long
Public cycleEnd As Long
Public CapacityAt24(8) As Single
Public CapacityNow(8) As Single
______________________________________

I'm not exactly a proficient programmer.. I just keep increasing the size of these arrays :)
The data that is loaded in eventually gets graphed.. I have a graphing OCX that I bought that is made to handle an unlimited number of points.. The data is grouped in sections of about 15000 points each..But that's not my issue.. it's just finding the limits given my amount of RAM (1 G)

So.. I will look through the answers and try some suggestions.. I appreciate the help!
I will try to get a result soon!

jeff
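
On the worry that a dynamic array might crash the program: an allocation failure can be trapped rather than left to stop the program. The following is only a sketch under stated assumptions. AllocateCellData and GrowCellData are hypothetical names, celldata is redeclared here as a dynamic array, and the point count is assumed to be known (for example from a first pass that counts the lines in the text file).

```vb
Option Explicit

Public celldata() As Single      ' dynamic: no size in the declaration

' Tries to size celldata for PointCount points on each of 8 channels.
' Returns False instead of stopping the program if the memory is not available.
Public Function AllocateCellData(ByVal PointCount As Long) As Boolean
    On Error GoTo AllocFailed
    ReDim celldata(0 To 7, 0 To PointCount - 1)
    AllocateCellData = True
    Exit Function
AllocFailed:
    ' Err.Number 7 is "Out of memory"; any other error also lands here
    AllocateCellData = False
End Function

' Grows the array while keeping the data already loaded.
' ReDim Preserve can only change the LAST dimension.
Public Function GrowCellData(ByVal NewPointCount As Long) As Boolean
    On Error GoTo GrowFailed
    ReDim Preserve celldata(0 To 7, 0 To NewPointCount - 1)
    GrowCellData = True
    Exit Function
GrowFailed:
    GrowCellData = False
End Function
```

Sizing the array once from a known count is usually the safer choice, since growing a very large array with ReDim Preserve may need room for the old and the new copy at the same time.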

Commented:
jscharpf,
    Is it possible for you to read in a few lines, graph the matching points, drop the data and read in the next few,  looping this way to the end of your file? That way you get rid of the huge memory demand.

Dang123

CERTIFIED EXPERT

Commented:
What is the objective of the application?
If speed is not  your concern why not write the data to a file and then pull the data from the file when needed.

Author

Commented:
wow so many suggestions.. :)...
The objective of the application is to view large quantities of test data. There are millions of data points and the data is viewed in "cycles".. The user needs to see all of the data, but there is so much that he/she can't see it all at once..so even though I store it all, I only show him portions at any one time..
The only problem with keeping the data in a file is that I'm rather ignorant when it comes to file access. I know sequential and that's it..the data is grouped in "cycles".. Cycle #1 will have XX data points, etc.. The user selects the cycle to view, or just clicks and moves through cycle by cycle. Each click redraws a graph of thousands of points representing one cycle. There can be thousands of cycles..
Having it in memory (for me) was easier to work with. If I understood file I/O then certainly I could have the specific data for each "cycle" brought in from this huge text file as needed. I just don't know how long it would take to parse through a 1 gig text file to find a set of points somewhere in the middle.
Currently it takes about 7  minutes to load the large file in. but once it's loaded (assuming I don't run out of memory) then it goes from cycle to cycle fairly quickly...
I'm sorry for the slow response but I am tied up in other things today.. I want to get back to this program soon.. :(
Jeff
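
For the question of finding one cycle in the middle of a large text file, one possible approach is to scan the file once, remember the byte position where each cycle starts, and later jump straight to that position with Seek. The sketch below makes assumptions that the thread does not confirm: every line is assumed to start with its cycle number (for example "12,0.5,3.71"), the lines of a cycle are assumed to be contiguous, and BuildCycleIndex and LoadCycle are made-up names. Seek works with Long positions, so the file must stay under 2 GB.

```vb
Option Explicit

Private mCycleStart() As Long   ' byte position of each cycle's first line
Private mCycleCount As Long

' One slow pass over the text file, recording where each cycle begins.
' ASSUMPTION: each line starts with the cycle number.
Public Sub BuildCycleIndex(ByVal FileName As String)
    Dim fn As Integer
    Dim sLine As String
    Dim lPos As Long
    Dim lCycle As Long
    Dim lLast As Long

    ReDim mCycleStart(0 To 999)
    mCycleCount = 0
    lLast = -1

    fn = FreeFile
    Open FileName For Input As #fn
    Do While Not EOF(fn)
        lPos = Seek(fn)                 ' position of the line about to be read
        Line Input #fn, sLine
        lCycle = CLng(Val(sLine))       ' Val stops at the first non-numeric character
        If lCycle <> lLast Then
            If mCycleCount > UBound(mCycleStart) Then
                ReDim Preserve mCycleStart(0 To mCycleCount * 2)
            End If
            mCycleStart(mCycleCount) = lPos
            mCycleCount = mCycleCount + 1
            lLast = lCycle
        End If
    Loop
    Close #fn
End Sub

' Reads only the lines of one cycle. Index is 0-based into the index above.
Public Sub LoadCycle(ByVal FileName As String, ByVal Index As Long)
    Dim fn As Integer
    Dim sLine As String
    Dim lStop As Long

    ' Stop where the next cycle starts, or run to end of file for the last one
    If Index + 1 < mCycleCount Then lStop = mCycleStart(Index + 1) Else lStop = &H7FFFFFFF

    fn = FreeFile
    Open FileName For Input As #fn
    Seek #fn, mCycleStart(Index)
    Do While Not EOF(fn)
        If Seek(fn) >= lStop Then Exit Do
        Line Input #fn, sLine
        ' parse sLine here and hand the values to the graph
    Loop
    Close #fn
End Sub
```

The index pass would still take about as long as the current full load, but it stores one Long per cycle instead of every data point, and each click afterwards only reads the few thousand lines of the selected cycle.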

CERTIFIED EXPERT

Commented:
In QA with large volumes of data, it is sometimes best to store all of the points in a database and use statistics to reveal trends and, most importantly, exceptions to the trends.
