
Sorting an Array with over 2 million members -  Analyzing on Excel VBA

Posted on 2011-09-16
Last Modified: 2012-05-12
I have VBA code that calculates over 2 million numbers and stores them in an array. I want to sort the values inside the array, but as far as I can see there is no built-in sort function in Excel VBA.
I tried QSortInPlace, which can be found at http://www.cpearson.com/excel/SortingArrays.aspx.

However, it doesn't seem to work when the array has 2 million members.

I guess something could be arranged when filling the array in the first place.

What is the best way to sort huge arrays?
Question by:awesomejohn19
LVL 42

Expert Comment

by:dlmille
ID: 36552382
The Long data type supports array indexes up to 2,147,483,647.  Will this suffice?

Here's a HeapSort algorithm using Long (I personally use QuickSort, but wanted something documented with Long array indexes, so here it is; untested by me):

http://www.source-code.biz/snippets/vbasic/1.htm

If not, then it can be done with a collection and a collection sort.  I can assist with this, but I'll first await your response to the question above.

Cheers,

Dave

 
LVL 42

Expert Comment

by:dlmille
ID: 36552420
Actually, a Variant array may be larger (I can't find my reference on that).  Let me see if I can load a Variant array with 3 million records and sort it with my QuickSort algorithm...
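
For anyone who wants to try this without worksheet data, here is a minimal sketch of building a large test array purely in memory and timing it. The names MakeTestArray, IsSortedAscending and DemoMakeTestArray are hypothetical helpers, not part of any posted routine, and the 3,000,000 element count is simply the figure mentioned above.

```vba
Option Explicit

' Fill a 1-based Double array with pseudo-random values entirely in memory.
' n is declared As Long because Integer tops out at 32,767.
Function MakeTestArray(ByVal n As Long) As Double()
    Dim arr() As Double
    Dim k As Long

    ReDim arr(1 To n)
    Randomize
    For k = 1 To n
        arr(k) = Rnd() * 1000000#
    Next k
    MakeTestArray = arr
End Function

' Return True when the array is in non-decreasing order.
Function IsSortedAscending(ByRef arr() As Double) As Boolean
    Dim k As Long

    For k = LBound(arr) To UBound(arr) - 1
        If arr(k) > arr(k + 1) Then Exit Function   ' result stays False
    Next k
    IsSortedAscending = True
End Function

' Hypothetical driver: build the array and report how long it took.
Sub DemoMakeTestArray()
    Dim data() As Double
    Dim t As Single

    t = Timer
    data = MakeTestArray(3000000)
    Debug.Print "Filled " & (UBound(data) - LBound(data) + 1) & _
                " elements in " & Format(Timer - t, "0.00") & " s"
End Sub
```

IsSortedAscending is there so that any sort routine tried afterwards can be checked rather than trusted.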

Dave
 
LVL 42

Expert Comment

by:dlmille
ID: 36552698
Here's a QuickSort macro I use all the time; I only changed Integer to Variant.  It SHOULD work, and right now I'm trying to figure out how to load an array with more than 1MM records without waiting forever.

Give it and the HeapSort a shot, as you're already in a position to test.  Note the usage of QSort:

Call QSort(myArray, LBound(myArray), UBound(myArray))

This sorts the Variant array myArray in place.  It's easy enough to add a Boolean to choose ascending or descending order, and I can help with that if you like.  Right now it sorts ascending.

Let me know if this works for you:

 
Sub QSort(sortArray As Variant, ByVal leftIndex As Integer, ByVal rightIndex As Integer)
    Dim compValue As Variant
    Dim i As Variant
    Dim j As Variant
    Dim tempVar As Variant

    i = leftIndex
    j = rightIndex
    
    compValue = sortArray(Int((i + j) / 2))

    Do
        Do While (sortArray(i) < compValue And i < rightIndex)
            i = i + 1
        Loop
        Do While (compValue < sortArray(j) And j > leftIndex)
            j = j - 1
        Loop
        If i <= j Then
        
            tempVar = sortArray(i)
            sortArray(i) = sortArray(j)
            sortArray(j) = tempVar
            
            i = i + 1
            j = j - 1
        End If
    Loop While i <= j

    If leftIndex < j Then QSort sortArray, leftIndex, j
    If i < rightIndex Then QSort sortArray, i, rightIndex
End Sub

 
LVL 42

Expert Comment

by:dlmille
ID: 36552700
My bad.  I had one typo in my Integer-to-Variant conversion: the leftIndex and rightIndex parameters were still declared As Integer, which cannot hold values above 32,767.

Sorry for the SPAM!

Here is the corrected version.  The usage and the rest of my previous comment are unchanged:

 
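Sub QSort(sortArray As Variant, ByVal leftIndex As Variant, ByVal rightIndex As Variant)
    ' In-place ascending QuickSort of sortArray between leftIndex and rightIndex.
    Dim compValue As Variant    ' pivot value
    Dim i As Variant            ' scans upward from the left
    Dim j As Variant            ' scans downward from the right
    Dim tempVar As Variant      ' swap buffer

    i = leftIndex
    j = rightIndex

    ' Use the middle element as the pivot.
    compValue = sortArray(Int((i + j) / 2))

    Do
        ' Skip elements that are already on the correct side of the pivot.
        Do While (sortArray(i) < compValue And i < rightIndex)
            i = i + 1
        Loop
        Do While (compValue < sortArray(j) And j > leftIndex)
            j = j - 1
        Loop
        If i <= j Then
            ' Swap the out-of-place pair and move both scans inward.
            tempVar = sortArray(i)
            sortArray(i) = sortArray(j)
            sortArray(j) = tempVar

            i = i + 1
            j = j - 1
        End If
    Loop While i <= j

    ' Recurse into each partition that still holds more than one element.
    If leftIndex < j Then QSort sortArray, leftIndex, j
    If i < rightIndex Then QSort sortArray, i, rightIndex
End Sub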
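
The ascending/descending switch mentioned earlier could look roughly like the sketch below. QSortDir and its descending parameter are hypothetical names, not part of the macro above. It also declares the indexes As Long rather than Variant, which is an assumption on my part about what suits a 2-million-element array; it has not been timed against the original.

```vba
' Sketch: in-place QuickSort with Long indexes and an optional sort direction.
' QSortDir and the descending flag are illustrative names only.
Sub QSortDir(ByRef sortArray As Variant, ByVal leftIndex As Long, _
             ByVal rightIndex As Long, Optional ByVal descending As Boolean = False)
    Dim i As Long, j As Long
    Dim pivot As Variant, tempVar As Variant

    i = leftIndex
    j = rightIndex
    ' Midpoint computed without adding the two indexes together.
    pivot = sortArray(leftIndex + (rightIndex - leftIndex) \ 2)

    Do
        If descending Then
            Do While (sortArray(i) > pivot And i < rightIndex)
                i = i + 1
            Loop
            Do While (pivot > sortArray(j) And j > leftIndex)
                j = j - 1
            Loop
        Else
            Do While (sortArray(i) < pivot And i < rightIndex)
                i = i + 1
            Loop
            Do While (pivot < sortArray(j) And j > leftIndex)
                j = j - 1
            Loop
        End If

        If i <= j Then
            tempVar = sortArray(i)
            sortArray(i) = sortArray(j)
            sortArray(j) = tempVar
            i = i + 1
            j = j - 1
        End If
    Loop While i <= j

    If leftIndex < j Then QSortDir sortArray, leftIndex, j, descending
    If i < rightIndex Then QSortDir sortArray, i, rightIndex, descending
End Sub
```

Usage would mirror the original call, for example: QSortDir myArray, LBound(myArray), UBound(myArray), True for descending order.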

 
LVL 49

Expert Comment

by:Martin Liss
ID: 36561758
I think that the thing to do is to not sort 2,000,000 records but rather to create the 'array' sorted as you build it. I have 'array' in single quotes because what I suggest is that you use the VBA Dictionary object instead. It is like a collection but faster. Here is a short tutorial.
 
LVL 42

Accepted Solution

by:
dlmille earned 2000 total points
ID: 36564436
There are a few approaches to pick from.

The attached workbook looks at these (credits to Andrewssd3, rorya, jan24 - as I was getting input from them on how to create a large array so I could evaluate an appropriate response).

1.  ADO method - populates the array using ADO, putting all the data in one dimension of a two-dimensional array.  Benefits here include the ability to extract UNIQUE values from the dataset (3 columns of 1MM rows each), as well as sorting incorporated in the process.

2.  BruteForce method - populates the array with range assignments to Variant arrays.  For 3 columns this could be done with a union of the 3 ranges, or by assigning the 3 ranges to 3 Variants; the final Variant array is then loaded "brute force", element by element.  The sort used is the QuickSort.

3.  Qsort2d method - populates the initial 2-D array with a single range assignment across all 3 columns, delivering a 2-D array with 3 columns.  A QuickSort for 2-D arrays (courtesy of Andrewssd3) is then used to complete the sort.
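
As a rough illustration of the range-assignment idea behind methods 2 and 3 (not the code in the attached workbook), the sketch below reads three columns into a Variant in one assignment and then flattens them element by element into a 1-D array. The sheet name "Data", the A:C layout and the function name LoadColumnsTo1D are all assumptions, and the cells are assumed to hold numbers only.

```vba
' Sketch: read three worksheet columns in one shot, then flatten to 1-D.
' Assumes numeric data in columns A:C of a sheet named "Data" (hypothetical).
Function LoadColumnsTo1D() As Double()
    Dim ws As Worksheet
    Dim src As Variant          ' 2-D, 1-based: src(row, column)
    Dim flat() As Double
    Dim lastRow As Long, r As Long, c As Long, k As Long

    Set ws = ThisWorkbook.Worksheets("Data")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' One range-to-array assignment instead of reading cell by cell.
    src = ws.Range("A1:C" & lastRow).Value

    ' "Brute force" copy, column by column, into a single dimension.
    ReDim flat(1 To lastRow * 3)
    For c = 1 To 3
        For r = 1 To lastRow
            k = k + 1
            flat(k) = src(r, c)
        Next r
    Next c

    LoadColumnsTo1D = flat
End Function
```

Reading the whole range at once is the important part; looping over individual cells is usually what makes loading a million rows feel like waiting forever.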

While you may already have your array loaded, the QuickSort and/or Qsort2D routines might help.  I believe I gave you code for QuickSort and a link to a HeapSort.

QuickSort is one of the fastest sorting methods on average, O(n log n), though its worst case is O(n^2) on unfavorable input (see http://en.wikipedia.org/wiki/Sorting_algorithm for a table of algorithms and their relative merits).

If you still need to load your array from the workbook or other dataset, then consider the 3 approaches, above, as each has its merit.

Attached, please find these approaches in the workbook, with timestamps on execution.

Cheers to the collaboration on another thread, which went well beyond its original question ("how do you create a 1d array from 3 columns of data?") to assist in this process: http:/Q_27313095.html

I have yet (due to other priorities) to code and compare the HeapSort and MergeSort, but will do so even after this particular E-E post closes.
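
Until that comparison exists, here is a minimal HeapSort sketch for anyone who wants to experiment. It is not the routine from the source-code.biz link and has not been benchmarked against the QuickSort above; HeapSort and SiftDown are names chosen for illustration. It sorts ascending, uses Long indexes throughout, and is iterative, so it does not recurse.

```vba
' Sketch: in-place ascending HeapSort for a 1-D array, Long indexes throughout.
Sub HeapSort(ByRef a As Variant)
    Dim lo As Long, n As Long, k As Long
    Dim tmp As Variant

    lo = LBound(a)
    n = UBound(a) - lo + 1
    If n < 2 Then Exit Sub

    ' Build a max-heap: sift down every parent node, last parent first.
    For k = n \ 2 - 1 To 0 Step -1
        SiftDown a, lo, k, n
    Next k

    ' Repeatedly move the current maximum to the end and shrink the heap.
    For k = n - 1 To 1 Step -1
        tmp = a(lo)
        a(lo) = a(lo + k)
        a(lo + k) = tmp
        SiftDown a, lo, 0, k
    Next k
End Sub

' Restore the heap property for the subtree rooted at zero-based position
' root, looking only at the first heapSize elements of the array.
Private Sub SiftDown(ByRef a As Variant, ByVal lo As Long, _
                     ByVal root As Long, ByVal heapSize As Long)
    Dim child As Long
    Dim tmp As Variant

    Do While 2 * root + 1 < heapSize
        child = 2 * root + 1
        ' Pick the larger of the two children, if a right child exists.
        If child + 1 < heapSize Then
            If a(lo + child) < a(lo + child + 1) Then child = child + 1
        End If
        If a(lo + root) >= a(lo + child) Then Exit Sub
        tmp = a(lo + root)
        a(lo + root) = a(lo + child)
        a(lo + child) = tmp
        root = child
    Loop
End Sub
```

Called as HeapSort myArray. HeapSort is O(n log n) even in the worst case, which is its main attraction next to QuickSort's O(n^2) worst case.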

Let us know how your work progresses, and whether any of these options worked for you.

Cheers,

Dave
sortLargeArray-r1.xlsm