We help IT Professionals succeed at work.

We've partnered with Certified Experts, Carl Webster and Richard Faulkner, to bring you a podcast all about Citrix Workspace, moving to the cloud, and analytics & intelligence. Episode 2 coming soon!Listen Now

x

Clean a VBA script to run faster

Medium Priority
90 Views
Last Modified: 2020-05-22
Hi
The attached files are an input file that is pasted as text into A1 of an excel sheet and the script to clean  and sort the dat. The script takes th info and separates it into columns and then a further worksheet with separate devices. The script currently takes about 2 minutes to run and have been trying to clean it up to run faster but i seem to break it when i do. Could  you help with cleaning it

Thanks in advance
input-text.txt
CDP-Clean-script.txt
Comment
Watch Question

ste5anSenior Developer
CERTIFIED EXPERT

Commented:
First of all: Use Option Explicit in all your code modules. Make it the default by checking Variable declaration required in the VBA IDE under Tools/Options. Cause you have undeclared variables.

Then importing that file by code would be faster. E.g.

Option Explicit
Public Sub Test()
  ReadSearchResult "C:\Temp\Test.txt"
End Sub
Public Sub ReadSearchResult(ByVal CFileName As String)
  Dim CurrentLine As String
  Dim DeviceCount As Long
  Dim DeviceID As String
  Dim FileHandle As Long
  Dim Interface As String
  Dim IPAddress As String
  Dim Platform As String
  Dim TokenPosition As Long
  Dim Token As String
 
  DeviceCount = 0
  DeviceID = "DeviceID"
  Interface = "Interface"
  IPAddress = "IPAddress"
  Platform = "Platform"
  FileHandle = FreeFile
  Open CFileName For Input As #FileHandle
  Do While Not EOF(FileHandle)
    Input #FileHandle, CurrentLine ' Line Input required, when parsing for post or capabilites.
    Token = " Device ID: "
    TokenPosition = InStr(CurrentLine, Token)
    If TokenPosition > 0 Then
      ActiveSheet.Range("A1").Offset(DeviceCount, 0).Value = DeviceID
      ActiveSheet.Range("A1").Offset(DeviceCount, 1).Value = IPAddress
      ActiveSheet.Range("A1").Offset(DeviceCount, 2).Value = Platform
      ActiveSheet.Range("A1").Offset(DeviceCount, 3).Value = Interface
      DeviceID = Trim(Mid(CurrentLine, TokenPosition + Len(Token), 1024))
      Debug.Print DeviceID
      DeviceCount = DeviceCount + 1
      GoTo ContinueDo
    End If
   
    Token = " IP address: "
    TokenPosition = InStr(CurrentLine, Token)
    If TokenPosition > 0 Then
      IPAddress = Trim(Mid(CurrentLine, TokenPosition + Len(Token), 1024))
      Debug.Print IPAddress
      GoTo ContinueDo
    End If
   
    Token = " Platform: "
    TokenPosition = InStr(CurrentLine, Token)
    If TokenPosition > 0 Then
      Platform = Trim(Mid(CurrentLine, TokenPosition + Len(Token), 1024))
      Debug.Print Platform
      GoTo ContinueDo
    End If
   
    Token = " Interface: "
    TokenPosition = InStr(CurrentLine, Token)
    If TokenPosition > 0 Then
      Interface = Trim(Mid(CurrentLine, TokenPosition + Len(Token), 1024))
      Debug.Print Interface
      GoTo ContinueDo
    End If
   
ContinueDo:
  Loop
 
  Close #FileHandle
End Sub

Author

Commented:
Hi
Thanks for the quick response. Do i insert the above code into my module and if so where. Am i using this to replace parts of my original code?

Colin
ste5anSenior Developer
CERTIFIED EXPERT

Commented:
It's as I said a different approach. Thus copy it into a new module.

Author

Commented:
Wow, Thanks as that is super fast at collecting info from all the files but the previous script picked up more info;
It added the filenames for each row of data in the first column (Host) as well as splitting the interface output to show the local and remote interface as well.
my headers were populated as below;
Host Local Interface Remote Interface Remote Host Remote IP Remote Platform


Thanks
CERTIFIED EXPERT
Top Expert 2014

Commented:
Please post a representative sample of the raw input text.

The fastest data insertion will be a block range-at-a-time, rather than cell-at-a-time.  I've detailed this in my Fast Data Push article:
https://www.experts-exchange.com/articles/2253/Fast-Data-Push-to-Excel.html

Author

Commented:
the input.txt file is the input data collected from multiple files
John KorchokProduction Manager
CERTIFIED EXPERT

Commented:
Another technique to speed up code execution is to use With statements:
With ActiveSheet.Range("A1")
   .Offset(DeviceCount, 0).Value = DeviceID
   .Offset(DeviceCount, 1).Value = IPAddress
   .Offset(DeviceCount, 2).Value = Platform
   .Offset(DeviceCount, 3).Value = Interface
End With

CERTIFIED EXPERT
Top Expert 2014

Commented:
the input.txt file is the input data collected from multiple files
The file you posted has been processed.  I would like to see some actual lines of one of those files.

Author

Commented:
this would be an example of the lines i am trying to extract.
Each file may contain between 1 and x numbers of devices but in the same format
snippet.txt

Author

Commented:
sorry meant to say that i have been processing through notepad++ using "Device ID:|  IP address:|Platform:|Interface:" that gives me the single file output as per the input.txt file
CERTIFIED EXPERT
Top Expert 2014

Commented:
I understand.  Thanks for posting the raw data.  Did you want to transform the data within Notepad++ (for subsequent copy/paste into Excel) or continue with a VBA solution?

Author

Commented:
ideally I could point it to the folder of files and it pulls  the data, otherwise I can continue processing through notepad ++. as I have to run other scripts anyway. it’s just this one that is currently  taking over two minutes to run. not sure which way is faster
CERTIFIED EXPERT
Top Expert 2014

Commented:
"Device ID:|  IP address:|Platform:|Interface:"
You are only pulling in Notepad++ lines that meet this criteria.  Are you then copying/pasting the search results into Excel and running the CDP_Neigh_Cleaned()  routine?

I'll have to look at the code quite a bit to see what statements result in your
Host Local Interface Remote Interface Remote Host Remote IP Remote Platform
Headers/columns.  There are more columns in your output than in your Notepad++ filter.
Can you help me map which columns are related to which tagged data?
CERTIFIED EXPERT
Top Expert 2014

Commented:
It's getting late.  I'll leave this regex pattern in case some other expert wants to use regular expressions in their solution.
(Device ID|IP address|Platform|Interface|Port ID \(outgoing port\)):\s+(.+?)(?:,|\r\n)

Author

Commented:
hi akimark
yes i run the regex through notepad++ then copy/pate into A1 of a worksheet;
The columns are renamed / mapped based on below;
Host
filename
Local Interface
Interface:
Remote Interface
Port ID (outgoing port):
Remote Host
Device ID:
Remote IP
  IP address:
Remote Platform
Platform:


CERTIFIED EXPERT
Top Expert 2014

Commented:
I don't see " filename" in your sample text
ste5anSenior Developer
CERTIFIED EXPERT

Commented:
Well, what about showing us a screenshot of the current output?


I've included only some columns. Cause I've tried to run your code, but it only returned three columns with out values. So I guessed the output.

Author

Commented:
so the steps i run;
in notepad++ using find in files pointed at a directory of files i run the following regex "Device ID:|  IP address:|Platform:|Interface:". The files it runs against contain a lot of info but the part I am interested for this exercise starts at Host1#show cdp neighbor detail
This returns the output in Notepad++ results.txt The start for each new file is  C:\Temp\Host1.log (15 hits). Each file has a variable number of hits. The Host1 is the file name which relates to the file name which is the device ID populated into Column A against the results in that file. The other columns are populated by the information for the devices connected. When it gets to the next host it repeats.
This 4 devices.txt  is an example output of just four files(devices) being run through Notepadd++ and this is what it looks like once it is cleaned and run through my script in excel Cleaned output.xlsx
Columns G, H and I are just the output in column B split into three so that the list can be sorted at the end by Column A , then G, H and I

Hope that above makes it clearer
Thanks all for your input on this. 
CERTIFIED EXPERT
Top Expert 2014

Commented:
Ah. I understand the missing filename issue now.  Thanks.  The cleaned-output workbook is also helpful.

Author

Commented:
hi guys
any further input required on this?
CERTIFIED EXPERT
Top Expert 2014

Commented:
I'm coding up a solution.  Should have something to show you tomorrow
CERTIFIED EXPERT
Top Expert 2014

Commented:
The solution to another question took more time than I anticipated.  Most of my 'day' is gone.  I'll be back tomorrow.  Pardon the delay.
CERTIFIED EXPERT
Top Expert 2014
Commented:
Please test this code.  You will be prompted for a folder name.
For testing purposes, you must create a Sheet1 worksheet.
Sub Q_29181125()
    Dim strFilename As String
    Dim vParentDirectory As Variant
    Dim strData As String
    Dim oRE As Object
    Dim oMatches As Object
    Dim oM As Object
    Dim lngSM As Long
    Dim lngRow
    Dim vParsedData As Variant
    
    Dim oDic As Object
    
    Dim wks As Worksheet
    Dim rng As Range

'Host=strFilename
'Local Interface=Interface
'Remote Interface=Port ID (outgoing port)
'Remote Host=Device ID
'Remote IP=IP Address
'Remote Platform=Platform
    
    Set oDic = CreateObject("system.collections.hashtable")
    oDic.Add "Interface", 1
    oDic.Add "Port ID (outgoing port)", 2
    oDic.Add "Device ID", 3
    oDic.Add "IP address", 4
    oDic.Add "Platform", 5
    
    Set wks = Worksheets("Sheet1")
    Set rng = wks.Range("A2")
    
    
    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "(Device ID|IP address|Platform|Interface|Port ID \(outgoing port\)):\s+(.+?)(?:,|\r\n)"
    
    With Application.FileDialog(msoFileDialogFolderPicker)  '("Select parent directory", "Select Parent")
        If .Show = True Then
            vParentDirectory = .SelectedItems(1)
        Else
            MsgBox "Please try again"
            Exit Sub
        End If
    End With
    If Len(vParentDirectory) = 0 Then
        MsgBox "Please try again"
        Exit Sub
    End If
    '====================================
    Application.ScreenUpdating = False
    
    strFilename = Dir(vParentDirectory & "\*.*", vbNormal)
    Do Until Len(strFilename) = 0
        Open vParentDirectory & "\" & strFilename For Input As #1
        strData = Input(LOF(1), #1)
        Close #1
        Debug.Print "Processing:", strFilename
        If oRE.test(strData) Then
            Set oMatches = oRE.Execute(strData)
            lngRow = 0
            For Each oM In oMatches
                If oM.submatches(0) = "Device ID" Then
                    lngRow = lngRow + 1
                End If
            Next
            ReDim vParsedData(1 To lngRow, 0 To 5)
            lngRow = 0
            For Each oM In oMatches
                If oM.submatches(0) = "Device ID" Then
                    lngRow = lngRow + 1
                    vParsedData(lngRow, 0) = strFilename
                End If
                vParsedData(lngRow, oDic(oM.submatches(0))) = oM.submatches(1)
                
            Next
        End If
        rng.Resize(lngRow, 6).Value = vParsedData
        Set rng = rng.Resize(1, 1)
        Set rng = rng.End(xlDown).Offset(1)
        strFilename = Dir
    Loop
    wks.Range("A1:F1").Value = Array("Host", "Local Interface", "Remote Interface", _
                                    "Remote Host", "Remote IP", "Remote Platform")

    Application.ScreenUpdating = True

End Sub

Open in new window

Not the solution you were looking for? Getting a personalized solution is easy.

Ask the Experts

Author

Commented:
Hi Aikimark
The script is working perfectly bringing up the folder and populating the sheet with the data. Great job. I tested on numerous folders with differing amounts of files and it works great. One thing i noticed and this is actually a good thing is that if a device has only 1 set of data to output i.e one connection, it stops and gives an error.
Thank you so much. For the final clean up and sorting do i just add the last part of the script or is that being cleaned also?

Once again thank you

Author

Commented:
Hi Aikimark
Did a quick merge of my old script with yours and this is the end result which is working exactly as planned in a few seconds
Thank you very much
Sub CDP_Import()
    Dim strFilename As String
    Dim vParentDirectory As Variant
    Dim strData As String
    Dim oRE As Object
    Dim oMatches As Object
    Dim oM As Object
    Dim lngSM As Long
    Dim lngRow
    Dim vParsedData As Variant
   
    Dim oDic As Object
   
    Dim wks As Worksheet
    Dim rng As Range

'Host=strFilename
'Local Interface=Interface
'Remote Interface=Port ID (outgoing port)
'Remote Host=Device ID
'Remote IP=IP Address
'Remote Platform=Platform
   
    Set oDic = CreateObject("system.collections.hashtable")
    oDic.Add "Interface", 1
    oDic.Add "Port ID (outgoing port)", 2
    oDic.Add "Device ID", 3
    oDic.Add "IP address", 4
    oDic.Add "Platform", 5
   
    Set wks = Worksheets("Sheet1")
    Set rng = wks.Range("A2")
   
   
    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "(Device ID|IP address|Platform|Interface|Port ID \(outgoing port\)):\s+(.+?)(?:,|\r\n)"
   
    With Application.FileDialog(msoFileDialogFolderPicker)  '("Select parent directory", "Select Parent")
        If .Show = True Then
            vParentDirectory = .SelectedItems(1)
        Else
            MsgBox "Please try again"
            Exit Sub
        End If
    End With
    If Len(vParentDirectory) = 0 Then
        MsgBox "Please try again"
        Exit Sub
    End If
    '====================================
    Application.ScreenUpdating = False
   
    strFilename = Dir(vParentDirectory & "\*.*", vbNormal)
    Do Until Len(strFilename) = 0
        Open vParentDirectory & "\" & strFilename For Input As #1
        strData = Input(LOF(1), #1)
        Close #1
        Debug.Print "Processing:", strFilename
        If oRE.Test(strData) Then
            Set oMatches = oRE.Execute(strData)
            lngRow = 0
            For Each oM In oMatches
                If oM.submatches(0) = "Device ID" Then
                    lngRow = lngRow + 1
                End If
            Next
            ReDim vParsedData(1 To lngRow, 0 To 5)
            lngRow = 0
            For Each oM In oMatches
                If oM.submatches(0) = "Device ID" Then
                    lngRow = lngRow + 1
                    vParsedData(lngRow, 0) = strFilename
                End If
                vParsedData(lngRow, oDic(oM.submatches(0))) = oM.submatches(1)
               
            Next
        End If
        rng.Resize(lngRow, 6).Value = vParsedData
        Set rng = rng.Resize(1, 1)
        Set rng = rng.End(xlDown).Offset(1)
        strFilename = Dir
    Loop
    wks.Range("A1:F1").Value = Array("Host", "Local Interface", "Remote Interface", _
                                    "Remote Host", "Remote IP", "Remote Platform")

    Columns("B:B").Select
    Selection.Copy
    Columns("G:G").Select
    ActiveSheet.Paste
    Application.CutCopyMode = False
    Selection.TextToColumns Destination:=Range("G1"), DataType:=xlDelimited, _
        TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=True, _
        Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar _
        :="/", FieldInfo:=Array(Array(1, 1), Array(2, 1), Array(3, 1)), _
        TrailingMinusNumbers:=True
    Selection.Replace What:="*net", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Selection.Replace What:="*i", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    'sort by interface
ActiveSheet.Name = "CDP Info"
    Range("A1").CurrentRegion.Select
Range("G1").Select
    ActiveCell.FormulaR1C1 = "1"
    Range("H1").Select
    ActiveCell.FormulaR1C1 = "2"
    Range("I1").Select
    ActiveCell.FormulaR1C1 = "3"
'
    ActiveWorkbook.Worksheets("CDP Info").Sort.SortFields.Clear
    ActiveWorkbook.Worksheets("CDP Info").Sort.SortFields.Add2 Key:=Range("A2:A2000" _
        ), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    ActiveWorkbook.Worksheets("CDP Info").Sort.SortFields.Add2 Key:=Range("G2:G2000" _
        ), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    ActiveWorkbook.Worksheets("CDP Info").Sort.SortFields.Add2 Key:=Range("H2:H2000" _
        ), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    ActiveWorkbook.Worksheets("CDP Info").Sort.SortFields.Add2 Key:=Range("I2:I2000" _
        ), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    With ActiveWorkbook.Worksheets("CDP Info").Sort
        .SetRange Range("A1:I2000")
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
Range("A:I").Columns.AutoFit
Call RemoveBlankRowsColumns
'Range("A1").Select
'Range(Selection, Selection.End(xlToRight)).Select
'Range(Selection, Selection.End(xlDown)).Select
Range("A1").CurrentRegion.Select

    Cells.Replace What:=".log*", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Cells.Replace What:=".eu*", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Cells.Replace What:=".na*", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Cells.Replace What:="cisco ", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Cells.Replace What:="GigabitEthernet", Replacement:="Gi", LookAt:=xlPart _
        , SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Cells.Replace What:="FastEthernet", Replacement:="Fa", LookAt:=xlPart _
        , SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Cells.Replace What:="TenGigabitEthernet", Replacement:="Te", LookAt:=xlPart _
        , SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False
    Selection.Borders(xlDiagonalDown).LineStyle = xlNone
    Selection.Borders(xlDiagonalUp).LineStyle = xlNone
    With Selection.Borders(xlEdgeLeft)
        .LineStyle = xlContinuous
        .ColorIndex = 0
        .TintAndShade = 0
        .Weight = xlThin
    End With
    With Selection.Borders(xlEdgeTop)
        .LineStyle = xlContinuous
        .ColorIndex = 0
        .TintAndShade = 0
        .Weight = xlThin
    End With
    With Selection.Borders(xlEdgeBottom)
        .LineStyle = xlContinuous
        .ColorIndex = 0
        .TintAndShade = 0
        .Weight = xlThin
    End With
    With Selection.Borders(xlEdgeRight)
        .LineStyle = xlContinuous
        .ColorIndex = 0
        .TintAndShade = 0
        .Weight = xlThin
    End With
    With Selection.Borders(xlInsideVertical)
        .LineStyle = xlContinuous
        .ColorIndex = 0
        .TintAndShade = 0
        .Weight = xlThin
    End With
    With Selection.Borders(xlInsideHorizontal)
        .LineStyle = xlContinuous
        .ColorIndex = 0
        .TintAndShade = 0
        .Weight = xlThin
    End With
     Application.ScreenUpdating = True
Cells(2, 1).Select
ActiveWindow.FreezePanes = True

Columns("D:F").Select
    Selection.Copy
    Sheets.Add After:=ActiveSheet
    ActiveSheet.Name = "CDP Host"
    Range("A1").Select
    ActiveSheet.Paste
    Range("A1:C1").Select
    Selection.Replace What:="Remote ", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByColumns, MatchCase:=True, SearchFormat:=False, _
        ReplaceFormat:=False, FormulaVersion:=xlReplaceFormula2
    Range("A1:C1").Select
    Range(Selection, Selection.End(xlDown)).Select
    Application.CutCopyMode = False
 'Sub RemoveDuplicatesAndSort()
    Dim colArr As Variant
    With Range("A1").CurrentRegion
        colArr = Evaluate("column(" & Range("A1").Resize(, .Columns.Count).Address & ")")
        ReDim Preserve colArr(0 To UBound(colArr) - 1)
        .RemoveDuplicates Columns:=(colArr), Header:=xlYes
        .Sort key1:=Range("A2"), order1:=xlAscending, Header:=xlYes
    End With
Cells(2, 1).Select
ActiveWindow.FreezePanes = True
End Sub



CERTIFIED EXPERT
Top Expert 2014

Commented:
if a device has only 1 set of data to output i.e one connection, it stops and gives an error
Please post such a file.  I'll test it against my import code.  I don't know how such a file would cause an error.
CERTIFIED EXPERT
Top Expert 2014
I think I know what is going wrong when a single-item file is processed.  The code is repositioning the rng variable to the next empty cell after the last non-empty cell with this statement:
        Set rng = rng.End(xlDown).Offset(1)

Open in new window


The way it is written, the .End(xlDown) method will go to the bottom (last row) in the worksheet if rng only had one row of data.  The .Offset(1) method will then reference a row beyond the end of the worksheet.

I think this would cover both cases.
        Set rng = wks.cells(wks.rows.count, 1).End(xlUp).Offset(1) 

Open in new window

aikimark
thanks for the follow up and the brilliant resolution to this
Access more of Experts Exchange with a free account
Thanks for using Experts Exchange.

Create a free account to continue.

Limited access with a free account allows you to:

  • View three pieces of content (articles, solutions, posts, and videos)
  • Ask the experts questions (counted toward content limit)
  • Customize your dashboard and profile

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.