VBA Sorting 2D array twice

Hi

I've got a large dataset which will be loaded into a 2D array. Unfortunately I don't know the size of the dataset before hand so Rows and Columns are variable.

I'm using a Function called QuickSortArray which I got from

I've copied the code from the link at the bottom of this question. The dataset I'm working with is messages between servers which in captured in time order. My application needs to compare messages sent from the same server but the messages need to be in order. My idea was to sort the dataset initially on serverName and then put each block of messages sent from the same server into time order.

I've added a picture below to demonstrate. Does anyone had an idea how I should go about doing this? I can't find any pointers online. Thanks for the help

Screen-Shot-2017-11-30-at-14.55.45.png





Public Sub QuickSortArray(ByRef SortArray As Variant, Optional lngMin As Long = -1, Optional lngMax As Long = -1, Optional lngColumn As Long = 0)
    On Error Resume Next

    'Sort a 2-Dimensional array

    ' SampleUsage: sort arrData by the contents of column 3
    '
    '   QuickSortArray arrData, , , 3

    '
    'Posted by Jim Rech 10/20/98 Excel.Programming

    'Modifications, Nigel Heffernan:

    '       ' Escape failed comparison with empty variant
    '       ' Defensive coding: check inputs

    Dim i As Long
    Dim j As Long
    Dim varMid As Variant
    Dim arrRowTemp As Variant
    Dim lngColTemp As Long

    If IsEmpty(SortArray) Then
        Exit Sub
    End If
    If InStr(TypeName(SortArray), "()") < 1 Then  'IsArray() is somewhat broken: Look for brackets in the type name
        Exit Sub
    End If
    If lngMin = -1 Then
        lngMin = LBound(SortArray, 1)
    End If
    If lngMax = -1 Then
        lngMax = UBound(SortArray, 1)
    End If
    If lngMin >= lngMax Then    ' no sorting required
        Exit Sub
    End If

    i = lngMin
    j = lngMax

    varMid = Empty
    varMid = SortArray((lngMin + lngMax) \ 2, lngColumn)

    ' We  send 'Empty' and invalid data items to the end of the list:
    If IsObject(varMid) Then  ' note that we don't check isObject(SortArray(n)) - varMid *might* pick up a valid default member or property
        i = lngMax
        j = lngMin
    ElseIf IsEmpty(varMid) Then
        i = lngMax
        j = lngMin
    ElseIf IsNull(varMid) Then
        i = lngMax
        j = lngMin
    ElseIf varMid = "" Then
        i = lngMax
        j = lngMin
    ElseIf VarType(varMid) = vbError Then
        i = lngMax
        j = lngMin
    ElseIf VarType(varMid) > 17 Then
        i = lngMax
        j = lngMin
    End If

    While i <= j
        While SortArray(i, lngColumn) < varMid And i < lngMax
            i = i + 1
        Wend
        While varMid < SortArray(j, lngColumn) And j > lngMin
            j = j - 1
        Wend

        If i <= j Then
            ' Swap the rows
            ReDim arrRowTemp(LBound(SortArray, 2) To UBound(SortArray, 2))
            For lngColTemp = LBound(SortArray, 2) To UBound(SortArray, 2)
                arrRowTemp(lngColTemp) = SortArray(i, lngColTemp)
                SortArray(i, lngColTemp) = SortArray(j, lngColTemp)
                SortArray(j, lngColTemp) = arrRowTemp(lngColTemp)
            Next lngColTemp
            Erase arrRowTemp

            i = i + 1
            j = j - 1
        End If
    Wend

    If (lngMin < j) Then Call QuickSortArray(SortArray, lngMin, j, lngColumn)
    If (i < lngMax) Then Call QuickSortArray(SortArray, i, lngMax, lngColumn)

End Sub

Open in new window

PJ0302917Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Rgonzo1971Commented:
Hi,

first you should sort for the time and then server if you want the result in your last example
in reverse order of importance

Regards
0
PJ0302917Author Commented:
Thanks for the reply, I thought if QuickSortArray was run a second time, It wouldn’t maintain the time order from the first search. Il run the code and double check
0
PJ0302917Author Commented:
Hi

Just got to try out the sort code I posted earlier. The input data is already in time order (Column A). When I apply a filter on Server number and look at the output, the filter works. All Server 1's messages are listed first, but say there are 50 messages from Server 1. When you read from 1st to 50th message, the times are not in order. Any idea's? I'm abit lost what to do
0
Why Diversity in Tech Matters

Kesha Williams, certified professional and software developer, explores the imbalance of diversity in the world of technology -- especially when it comes to hiring women. She showcases ways she's making a difference through the Colors of STEM program.

Gustav BrockCIOCommented:
My idea was to sort the dataset initially on serverName and then put each block of messages sent from the same server into time order.

I would adjust the code to sort on not one but two columns in one go. So, sort on Server, then Time, both ascending.

/gustav
0
PJ0302917Author Commented:
Thanks for the suggestion /gustav

I assume the code would need to go in the final if statement which is

 If i <= j Then
            ' Swap the rows
            ReDim arrRowTemp(LBound(SortArray, 2) To UBound(SortArray, 2))
            For lngColTemp = LBound(SortArray, 2) To UBound(SortArray, 2)
                arrRowTemp(lngColTemp) = SortArray(i, lngColTemp)
                SortArray(i, lngColTemp) = SortArray(j, lngColTemp)
                SortArray(j, lngColTemp) = arrRowTemp(lngColTemp)
            Next lngColTemp
            Erase arrRowTemp

            i = i + 1
            j = j - 1
        End If

Open in new window


Specifically the last for loop? so..
arrRowTemp(lngColTemp) = SortArray(i, lngColTemp) copies the lower row (i) (below the mid value) that needs to be swapped to a temporary array
SortArray(i, lngColTemp) = SortArray(j, lngColTemp) copies the row above mid that needs to go lower (j), to overwrite the (i) row
SortArray(j, lngColTemp) = arrRowTemp(lngColTemp) copies the (i) array stored in a temp array to the (j) row

Where would you add code to sort on time? it would be nice to just check row-1 and keep checking until the time is lower or the server number changes. The problem with that isn't going to work because you need to check all the rows associated with that server number.

I'm stuck how to do it efficiently because im going to be working with large datasets
0
Gustav BrockCIOCommented:
I guess you will need a compound key using string comparison (air code):

varMid = SortArray((lngMin + lngMax) \ 2, lngColumn1) & "|" & SortArray((lngMin + lngMax) \ 2, lngColumn2)

Open in new window

then something like:

    While SortArray(i, lngColumn1) & "|" & SortArray(i, lngColumn2) < varMid And i < lngMax
        i = i + 1
    Wend
    While varMid < SortArray(j, lngColumn1) & "|" & SortArray(j, lngColumn2) And j > lngMin
        j = j - 1
    Wend

Open in new window

/gustav
0
PJ0302917Author Commented:
Thank you /gustav

Thats a hell of a lot closer and I would have never thought of that. I really appreciate the help thanks. The Server ID is being sorted correctly, the timings are being sorted but are coming out like this

1352.97
1434.15
1520.20
1611.41
2285.82
2422.97
354.20         <------
421.86
447.17
474.00
502.44
598.42

Any idea why this is happening? It doesn't seem to be taking the decimal point into consideration?
0
Gustav BrockCIOCommented:
That's because of the sorting as text.  You may have to apply a format to the values, like:

SortArray(i, lngColumn1) & "|" & Format(SortArray(i, lngColumn2), "0000.00")

Open in new window

to create strings like "0354.20"  

/gustav
0
aikimarkCommented:
You can also create a comparison function with two variant array parameters.  These two arrays are the 'key' values from the two items to be compared.  Since you are using a QuickSort algorithm, the pivot keys won't change.  Each item in the arrays have an implicit data type and don't need to be cast into formatted string values.
0
aikimarkCommented:
Example:
Function CompareVectors(parmFirst, parmSecond) As Long
    Dim lngIndex As Long
    Dim lngCompare As Long
    lngIndex = LBound(parmFirst)
    Do
        Select Case parmFirst(lngIndex)
            Case parmSecond(lngIndex)
                lngCompare = 0
            Case Is < parmSecond(lngIndex)
                lngCompare = -1
            Case Else
                lngCompare = 1
        End Select
        lngIndex = lngIndex + 1
    Loop While (lngIndex <= UBound(parmFirst)) And lngCompare = 0
    CompareVectors = lngCompare
End Function

Open in new window


Tests:
?CompareVectors(array(3,"a"), array(3, "b"))
-1 
?CompareVectors(array(3,"a"), array(3, "a"))
 0 
?CompareVectors(array(3,"c"), array(3, "b"))
 1 
?CompareVectors(array(4,"a"), array(3, "b"))
 1 

Open in new window

In this case, the CompareVectors function doesn't really care about the data type of the vector items (can be mixed), as long as the vectors are the same size.
0
PJ0302917Author Commented:
Wow, thank you for taking the time to write that code aikimark. I'll test it as soon as I'm at the computer.
0
aikimarkCommented:
You could use collection objects instead of arrays for the parameters.  In that case, you would not use the Lbound() and Ubound() functions.
0
PJ0302917Author Commented:
HI

Had chance to work on this today and have some updates

@/gustav
I couldn't get the Format as text to work well. I assume its because the server number and time are being joined together and I don't know the size of the server number before hand makes it hard to filter on the text string? I did try putting some large numbers in like 00000000.000000000 but it didn't seem to work.

@aikimark
With your solution, when you talk about vectors, are you talking about the I and j values? The quicksort code im using takes a mid point value then I is the first element in the array that is > mid point and j is the first value lower than mid point working backwards from the end of the array. If thats the case thats good, I just couldn't see how the 2nd filter was going to work?

I've found another implementation online where the 2d array is written back to a worksheet range and they used the inbuilt sort function

Call excel_range.Sort(Key1:=array_sortkeys(1), Order1:=array_sortorders(1), Key2:=array_sortkeys(2), Order2:=array_sortorders(2), Header:=tag_header, MatchCase:=tag_matchcase)

Open in new window

 

This works and I assume reading and writing large datasets to a worksheet is a big no-no
0
aikimarkCommented:
@PJ0302917

You have several choices.

The simplest solution is to sort your 2D array twice,  First, referencing the least significant column on the first pass and the most significant column on the second pass.

A more complicated solution would be to reference both columns in a multi-column comparison of two items.

You can extend the algorithm to use the comparison function I posted, or some version/variant of it.  This way, your quicksort routine doesn't have to change very much.

As you've mentioned, you can push the 2D array data into an Excel worksheet and then use the range sort method.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Gustav BrockCIOCommented:
I assume its because the server number and time are being joined together and I don't know the size of the server number before hand makes it hard to filter on the text string?

Can't tell. I have only your posted sample data.

/gustav
1
PJ0302917Author Commented:
Sorry for not getting back with additional data, plan to start work again in a few days but will start a new question if still having problems
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
VBA

From novice to tech pro — start learning today.