Array() and Split() performance comparison

aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT
Published:

Introduction

On a recent question, I supplied a simple function to format a number with a suffix, denoting its scale (MB, GB, TB, etc.).  While the function was simple enough, I wondered if I had supplied the most efficient code.  This article compares the performance of the Array() and Split() functions.


The Problem

The question ( http:Q_28494315.html) stated a need to add the scale suffix to a number.  Excel did not provide the ability to format cell values this way and conditional formatting does not affect the cell's value.  The cell still contains a numeric value, while Excel displays the formatted value.


Example:


123456 = 123.456 KB
12345600 = 12.3456 MB
1234560000 = 1.23456 GB


The Accepted and Alternative Solutions

Since there were only three possible suffixes, the conditional formatting solution was the simplest and the one accepted as the question solution.

[>999999999999]0.00,,,,"TB";[>999999999]0.00,,,"GB";0.00,,"MB"
                      

Open in new window

Starting with Excel 2007, it is possible to use the conditional formatting wizard (see below) to create a conditional formats with more than three conditions.  I am writing a separate article on this.
ConditionalFormattingRulesMgr.jpg


The User-Defined Function Solutions

Let's start with a routine that uses an Array() function, most closely resembling the question's solution.  If the magnitude of the parameter value is greater than 1000, we use the base1000 log of the number to index into the Array() values.


SizeA -- An Array() function


Public Function SizeA(ByVal parmValue)
                          Static lngPower As Long
                          If Abs(parmValue) < 1000 Then
                              SizeA = parmValue
                          Else
                              lngPower = Int(Log(parmValue) / Log(1000))
                              SizeA = parmValue / (1000 ^ lngPower) & " " & Array("KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")(lngPower - 1)
                          End If
                      End Function
                      

Open in new window


SizeB -- A Split() function

We've replaced the Array() function with some string constants we use in the Split function.

Public Function SizeB(ByVal parmValue)
                          Static lngPower As Long
                          Const strSuffix As String = "KB^MB^GB^TB^PB^EB^ZB^YB"
                          Const strDelim As String = "^"
                          
                          If Abs(parmValue) < 1000 Then
                              SizeB = parmValue
                          Else
                              lngPower = Int(Log(parmValue) / Log(1000))
                              SizeB = parmValue / (1000 ^ lngPower) & " " & Split(strSuffix, strDelim)(lngPower - 1)
                          End If
                      End Function
                      

Open in new window


Head-to-head Performance test

I ran simple elapsed time tests on these two routines.  Until we get to 10,000 iterations, there is no measurable difference between them.  Even at 10K iterations, there is on .02 seconds difference with the edge going to the Split() version of the routine, SizeB().


Tweaking the Performance

I decided to play with these two routines to see if I could improve their performance.  Two questions to be addressed:

  1. How much does the Array() function cost to execute inside the function?
  2. Is the use of constant variable declarations faster than string literals?


SizeAA -- Passing the Array() value

While this might be considered cheating, its superior performance does support the general rule to minimize the number of operations you perform inside a loop or routine.

Public Function SizeAA(ByVal parmValue, ByVal parmSuffixes)
                          Static lngPower As Long
                          If Abs(parmValue) < 1000 Then
                              SizeAA = parmValue
                          Else
                              lngPower = Int(Log(parmValue) / Log(1000))
                              SizeAA = parmValue / (1000 ^ (lngPower)) & " " & parmSuffixes(lngPower - 1)
                          End If
                      End Function
                      

Open in new window


SizeBB -- constants vs. string literals

I was under the impression that constants behaved like Static variables.  However, my performance figure don't reflect much of a performance difference between const strings and string literals.

Public Function SizeBB(ByVal parmValue)
                          Static lngPower As Long
                          
                          If Abs(parmValue) < 1000 Then
                              SizeBB = parmValue
                          Else
                              lngPower = Int(Log(parmValue) / Log(1000))
                              SizeBB = parmValue / (1000 ^ lngPower) & " " & Split("KB^MB^GB^TB^PB^EB^ZB^YB", "^")(lngPower - 1)
                          End If
                      End Function
                      

Open in new window


SizeBBB -- module-level constants

In this configuration, I'm eliminating any possible overhead for either const strings or string literals.

Public Function SizeBBB(ByVal parmValue)
                          Static lngPower As Long
                          
                          If Abs(parmValue) < 1000 Then
                              SizeBBB = parmValue
                          Else
                              lngPower = Int(Log(parmValue) / Log(1000))
                              SizeBBB = parmValue / (1000 ^ lngPower) & " " & Split(gstrSuffix, gstrDelim)(lngPower - 1)
                          End If
                      End Function
                      

Open in new window


SizeB2 -- static string variables

This is a technique that should be in most every function you write that involves some initialization of objects or initial setting of variable values, where the variables do not change value during the invocation of the routine.  In the Design Patterns world, this would be a singleton event in the life of the routine or class.

Public Function SizeB2(ByVal parmValue)
                          Static lngPower As Long
                          Static strSuffix As String
                          Static strDelim As String
                          If Len(strSuffix) = 0 Then
                              strSuffix = "KB^MB^GB^TB^PB^EB^ZB^YB"
                              strDelim = "^"
                          End If
                          If Abs(parmValue) < 1000 Then
                              SizeB2 = parmValue
                          Else
                              lngPower = Int(Log(parmValue) / Log(1000))
                              SizeB2 = parmValue / (1000 ^ lngPower) & " " & Split(strSuffix, strDelim)(lngPower - 1)
                          End If
                      End Function
                      

Open in new window

 

Tweaked performance comparisons -- multiple 10k runs


MIN     	MAX     	AVERAGE 	ROUTINE
                      0.0859375	0.09375 	0.08828125	SizeA
                      0.0546875	0.0625  	0.06171875	SizeAA
                      0.0625  	0.0703125	0.06796875	SizeB
                      0.0625  	0.078125	0.06875 	SizeB2
                      0.0625  	0.078125	0.0703125	SizeBB
                      0.0625   	0.078125	0.06953125	SizeBBB
                      

Open in new window


Conclusions


  • The Split() function is a better performer than the Array() function.
  • If you can pass an array into your function, regardless of whether the result of an Array() or Split() function invocation, you will get the best performance.
  • String literals, global constants, local constants, and static (initialize once) variables perform better than module-level constants.
 

The Timing Code

To help you evaluate these different routines in your own run-time environment, here is the code I used to do the timing.

Option Explicit
                      
                      Const gstrSuffix As String = "KB^MB^GB^TB^PB^EB^ZB^YB"
                      Const gstrDelim As String = "^"
                      
                      
                      Public Sub timeit()
                          Dim sngStart As Single
                          Dim lngLoop As Long
                          Dim strResult As String
                          Const lngCount As Long = 100000
                          Dim vSuffixes As Variant
                          vSuffixes = Array("KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
                          
                          sngStart = Timer
                          For lngLoop = 1 To lngCount
                              strResult = SizeA(123456 + lngLoop)
                          Next
                          Debug.Print "SizeA", Timer - sngStart, strResult
                          
                          sngStart = Timer
                          For lngLoop = 1 To lngCount
                              strResult = SizeAA(123456 + lngLoop, vSuffixes)
                          Next
                          Debug.Print "SizeAA", Timer - sngStart, strResult
                          
                          
                          sngStart = Timer
                          For lngLoop = 1 To lngCount
                              strResult = SizeB(123456 + lngLoop)
                          Next
                          Debug.Print "SizeB", Timer - sngStart, strResult
                          
                          sngStart = Timer
                          For lngLoop = 1 To lngCount
                              strResult = SizeBB(123456 + lngLoop)
                          Next
                          Debug.Print "SizeBB", Timer - sngStart, strResult
                          
                          sngStart = Timer
                          For lngLoop = 1 To lngCount
                              strResult = SizeBBB(123456 + lngLoop)
                          Next
                          Debug.Print "SizeBBB", Timer - sngStart, strResult
                          
                          sngStart = Timer
                          For lngLoop = 1 To lngCount
                              strResult = SizeB2(123456 + lngLoop)
                          Next
                          Debug.Print "SizeB2", Timer - sngStart, strResult
                      End Sub
                      

Open in new window

Here is a bas file you can import into your VBproject environment.
A-14860.bas

 =-=-=-=-=-=-=-=-=-=-=-=-= -=-=-=-=-= -=-=-=-=-= -=-=-=-=-= -=-=-=
If you liked this article and want to see more from this author,   please click here.
 
If you found this article helpful, please click the Yes link near the:
 
      Was this article helpful?
 
label that is just below and to the right of this text.   Thanks!
=-=-=-=-=-=-=-=-=-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=
1
3,309 Views
aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT

Comments (0)

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.