[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

VBScript to remove Excel Duplicates

Posted on 2011-04-29
9
Medium Priority
?
2,368 Views
Last Modified: 2012-05-11
Hi,

Asking for assistance with this script to add duplicate removal functionality.  The goal would be to remove the entire row if any duplicates occur within the "Account" field.  

The most important criteria though is that the "Levy" field continue to drop anything below the 2500 threshold, which it is doing, so any duplicate removing logic should not interfere with that.

Had to great experts assist previously who did wonderful getting it to this point.

Thanks

J
' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
Const xlSaveChanges = 1 'Excel Save Changes
Const xlAscending = 1 'Excel Sort Ascending
Const xlDescending = 2 'Excel Sort Decending
Const xlYes = 1 'Excel Header Row Exists

cTab = Chr(9)

' Get input and output file names from command line parms
If (WScript.Arguments.Count > 0) Then
    sInfile = WScript.Arguments(0)
Else
    WScript.Echo "No input filename specified."
    WScript.Quit
End If
If (WScript.Arguments.Count > 1) Then
    sOutfile = WScript.Arguments(1)
Else
    WScript.Echo "No output filename specified."
    WScript.Quit
End If

'Create Excel Object
Set objExcel = CreateObject("Excel.Application")
objExcel.DisplayAlerts = False 'Disable Overwrite Prompts
objExcel.Visible = True 'Hides Excel window

'Create Excel worksheet and add header row
objExcel.Workbooks.Add()
objExcel.Cells(1, 1) = "Account"
objExcel.Cells(1, 2) = "Year"
objExcel.Cells(1, 3) = "LevyBalance"
objExcel.Cells(1, 4) = "Owner"
objExcel.Cells(1, 5) = "Address2"
objExcel.Cells(1, 6) = "Address3"
objExcel.Cells(1, 7) = "Address4"
objExcel.Cells(1, 8) = "City"
objExcel.Cells(1, 9) = "State"
objExcel.Cells(1, 10) = "ZipCode"
intRow = 2 'Set first row to add data

' Create file system object
Set oFSO = CreateObject("Scripting.FileSystemObject")

' Read entire input file into a variable and close it
Set oInfile = oFSO.OpenTextFile(sInfile, ForReading, False, TriStateUseDefault)

Do While Not oInfile.AtEndOfStream
    sLine = oInfile.ReadLine
    Account = Trim(Mid(sLine, 1, 22))
    TYear = Trim(Mid(sLine, 23, 4))
    Over65 = Trim(Mid(sLine, 65, 1))
    Veteran = Trim(Mid(sLine, 66, 1))
    Disabled = Trim(Mid(sLine, 67, 1))
    DatePaid = Trim(Mid(sLine, 69, 8))
    LevyBalance = Trim(Mid(sLine, 87, 11))
    BankRupt = Trim(Mid(sLine, 139, 1))
    Owner = Trim(Mid(sLine, 202, 40))
    Address2 = Trim(Mid(sLine, 242, 40))
    Address3 = Trim(Mid(sLine, 282, 40))
    Address4 = Trim(Mid(sLine, 322, 40))
    City = Trim(Mid(sLine, 362, 24))
    State = Trim(Mid(sLine, 386, 2))
    ZipCode = Mid(sLine, 388, 5) & "-" & Mid(sLine, 393, 4)
           
    If (TYear <> "" And CLng(TYear) < 2010) And Over65 = "" And Veteran = "" And Disabled = "" And DatePaid = "" And ( LevyBalance <> "" And CLng(LevyBalance) >= 2500 ) And BankRupt = "" Then
        objExcel.Cells(intRow, 1) = Account
        objExcel.Cells(intRow, 2) = TYear
        objExcel.Cells(intRow, 3) = LevyBalance
        objExcel.Cells(intRow, 4) = Owner
        objExcel.Cells(intRow, 5) = Address2
        objExcel.Cells(intRow, 6) = Address3
        objExcel.Cells(intRow, 7) = Address4
        objExcel.Cells(intRow, 8) = City
        objExcel.Cells(intRow, 9) = State
        objExcel.Cells(intRow, 10) = ZipCode
        intRow = intRow + 1
    End If
Loop

'Format and Sort Excel workbook

Set objSheet = objExcel.ActiveWorkbook.ActiveSheet
objSheet.Columns("A:A").NumberFormat = "0"
Set objRange = objSheet.UsedRange
objRange.EntireColumn.AutoFit()
For iW = objRange.Column To objRange.Columns.Count
    colWidth = objSheet.Columns(iW).ColumnWidth
    objSheet.Columns(iW).ColumnWidth = colWidth + 5
Next
        
'Save Excel Workbook
objExcel.ActiveWorkbook.SaveAs(sOutfile)
'objExcel.ActiveWorkbook.Close xlSaveChanges
'objExcel.Quit
WScript.Echo "Script Finished"

' Cleanup and end
oInfile.Close
Set oInfile = Nothing
Set oFSO = Nothing

Open in new window

0
Comment
Question by:jon1966
  • 5
  • 4
9 Comments
 
LVL 14

Expert Comment

by:sungenwang
ID: 35490945
can you attach a sample file?
sew
0
 
LVL 3

Author Comment

by:jon1966
ID: 35491212
Sure, here you go, thank you.
sample.txt
0
 
LVL 14

Expert Comment

by:sungenwang
ID: 35492466
OK. I have added a dictionary object to your code that will dicard any duplicate "Account" field.
sew

' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
Const xlSaveChanges = 1 'Excel Save Changes
Const xlAscending = 1 'Excel Sort Ascending
Const xlDescending = 2 'Excel Sort Decending
Const xlYes = 1 'Excel Header Row Exists

cTab = Chr(9)

' Get input and output file names from command line parms
If (WScript.Arguments.Count > 0) Then
    sInfile = WScript.Arguments(0)
Else
    WScript.Echo "No input filename specified."
    WScript.Quit
End If
If (WScript.Arguments.Count > 1) Then
    sOutfile = WScript.Arguments(1)
Else
    WScript.Echo "No output filename specified."
    WScript.Quit
End If

'Create Excel Object
Set objExcel = CreateObject("Excel.Application")
objExcel.DisplayAlerts = False 'Disable Overwrite Prompts
objExcel.Visible = True 'Hides Excel window

'Create Excel worksheet and add header row
objExcel.Workbooks.Add()
objExcel.Cells(1, 1) = "Account"
objExcel.Cells(1, 2) = "Year"
objExcel.Cells(1, 3) = "LevyBalance"
objExcel.Cells(1, 4) = "Owner"
objExcel.Cells(1, 5) = "Address2"
objExcel.Cells(1, 6) = "Address3"
objExcel.Cells(1, 7) = "Address4"
objExcel.Cells(1, 8) = "City"
objExcel.Cells(1, 9) = "State"
objExcel.Cells(1, 10) = "ZipCode"
intRow = 2 'Set first row to add data

' Create file system object
Set oFSO = CreateObject("Scripting.FileSystemObject")

' Read entire input file into a variable and close it
Set oInfile = oFSO.OpenTextFile(sInfile, ForReading, False, TriStateUseDefault)

Set objDic = CreateObject("Scripting.Dictionary")
    
Do While Not oInfile.AtEndOfStream
    sLine = oInfile.ReadLine

    Account = Trim(Mid(sLine, 1, 22))
    TYear = Trim(Mid(sLine, 23, 4))
    Over65 = Trim(Mid(sLine, 65, 1))
    Veteran = Trim(Mid(sLine, 66, 1))
    Disabled = Trim(Mid(sLine, 67, 1))
    DatePaid = Trim(Mid(sLine, 69, 8))
    LevyBalance = Trim(Mid(sLine, 87, 11))
    BankRupt = Trim(Mid(sLine, 139, 1))
    Owner = Trim(Mid(sLine, 202, 40))
    Address2 = Trim(Mid(sLine, 242, 40))
    Address3 = Trim(Mid(sLine, 282, 40))
    Address4 = Trim(Mid(sLine, 322, 40))
    City = Trim(Mid(sLine, 362, 24))
    State = Trim(Mid(sLine, 386, 2))
    ZipCode = Mid(sLine, 388, 5) & "-" & Mid(sLine, 393, 4)
           
    If Not objDic.Exists(Account) Then
	objDic.Add Account, Account

        If (TYear <> "" And CLng(TYear) < 2010) And Over65 = "" And Veteran = "" And Disabled = "" And DatePaid = "" And ( LevyBalance <> "" And CLng(LevyBalance) >= 2500 ) And BankRupt = "" Then
            objExcel.Cells(intRow, 1) = Account
            objExcel.Cells(intRow, 2) = TYear
            objExcel.Cells(intRow, 3) = LevyBalance
            objExcel.Cells(intRow, 4) = Owner
            objExcel.Cells(intRow, 5) = Address2
            objExcel.Cells(intRow, 6) = Address3
            objExcel.Cells(intRow, 7) = Address4
            objExcel.Cells(intRow, 8) = City
            objExcel.Cells(intRow, 9) = State
            objExcel.Cells(intRow, 10) = ZipCode
            intRow = intRow + 1
        End If
    End If
Loop

'Format and Sort Excel workbook

Set objSheet = objExcel.ActiveWorkbook.ActiveSheet
objSheet.Columns("A:A").NumberFormat = "0"
Set objRange = objSheet.UsedRange
objRange.EntireColumn.AutoFit()
For iW = objRange.Column To objRange.Columns.Count
    colWidth = objSheet.Columns(iW).ColumnWidth
    objSheet.Columns(iW).ColumnWidth = colWidth + 5
Next
        
'Save Excel Workbook
objExcel.ActiveWorkbook.SaveAs(sOutfile)
'objExcel.ActiveWorkbook.Close xlSaveChanges
'objExcel.Quit
WScript.Echo "Script Finished"

' Cleanup and end
oInfile.Close
Set oInfile = Nothing
Set oFSO = Nothing

Open in new window

0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
LVL 14

Expert Comment

by:sungenwang
ID: 35492472
' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
Const xlSaveChanges = 1 'Excel Save Changes
Const xlAscending = 1 'Excel Sort Ascending
Const xlDescending = 2 'Excel Sort Decending
Const xlYes = 1 'Excel Header Row Exists

cTab = Chr(9)

' Get input and output file names from command line parms
If (WScript.Arguments.Count > 0) Then
    sInfile = WScript.Arguments(0)
Else
    WScript.Echo "No input filename specified."
    WScript.Quit
End If
If (WScript.Arguments.Count > 1) Then
    sOutfile = WScript.Arguments(1)
Else
    WScript.Echo "No output filename specified."
    WScript.Quit
End If

'Create Excel Object
Set objExcel = CreateObject("Excel.Application")
objExcel.DisplayAlerts = False 'Disable Overwrite Prompts
objExcel.Visible = True 'Hides Excel window

'Create Excel worksheet and add header row
objExcel.Workbooks.Add()
objExcel.Cells(1, 1) = "Account"
objExcel.Cells(1, 2) = "Year"
objExcel.Cells(1, 3) = "LevyBalance"
objExcel.Cells(1, 4) = "Owner"
objExcel.Cells(1, 5) = "Address2"
objExcel.Cells(1, 6) = "Address3"
objExcel.Cells(1, 7) = "Address4"
objExcel.Cells(1, 8) = "City"
objExcel.Cells(1, 9) = "State"
objExcel.Cells(1, 10) = "ZipCode"
intRow = 2 'Set first row to add data

' Create file system object
Set oFSO = CreateObject("Scripting.FileSystemObject")

' Read entire input file into a variable and close it
Set oInfile = oFSO.OpenTextFile(sInfile, ForReading, False, TriStateUseDefault)

Set objDic = CreateObject("Scripting.Dictionary")
   
Do While Not oInfile.AtEndOfStream
    sLine = oInfile.ReadLine

    Account = Trim(Mid(sLine, 1, 22))
    TYear = Trim(Mid(sLine, 23, 4))
    Over65 = Trim(Mid(sLine, 65, 1))
    Veteran = Trim(Mid(sLine, 66, 1))
    Disabled = Trim(Mid(sLine, 67, 1))
    DatePaid = Trim(Mid(sLine, 69, 8))
    LevyBalance = Trim(Mid(sLine, 87, 11))
    BankRupt = Trim(Mid(sLine, 139, 1))
    Owner = Trim(Mid(sLine, 202, 40))
    Address2 = Trim(Mid(sLine, 242, 40))
    Address3 = Trim(Mid(sLine, 282, 40))
    Address4 = Trim(Mid(sLine, 322, 40))
    City = Trim(Mid(sLine, 362, 24))
    State = Trim(Mid(sLine, 386, 2))
    ZipCode = Mid(sLine, 388, 5) & "-" & Mid(sLine, 393, 4)
           
    If Not objDic.Exists(Account) Then
      objDic.Add Account, Account

        If (TYear <> "" And CLng(TYear) < 2010) And Over65 = "" And Veteran = "" And Disabled = "" And DatePaid = "" And ( LevyBalance <> "" And CLng(LevyBalance) >= 2500 ) And BankRupt = "" Then
            objExcel.Cells(intRow, 1) = Account
            objExcel.Cells(intRow, 2) = TYear
            objExcel.Cells(intRow, 3) = LevyBalance
            objExcel.Cells(intRow, 4) = Owner
            objExcel.Cells(intRow, 5) = Address2
            objExcel.Cells(intRow, 6) = Address3
            objExcel.Cells(intRow, 7) = Address4
            objExcel.Cells(intRow, 8) = City
            objExcel.Cells(intRow, 9) = State
            objExcel.Cells(intRow, 10) = ZipCode
            intRow = intRow + 1
        End If
    End If
Loop

'Format and Sort Excel workbook

Set objSheet = objExcel.ActiveWorkbook.ActiveSheet
objSheet.Columns("A:A").NumberFormat = "0"
Set objRange = objSheet.UsedRange
objRange.EntireColumn.AutoFit()
For iW = objRange.Column To objRange.Columns.Count
    colWidth = objSheet.Columns(iW).ColumnWidth
    objSheet.Columns(iW).ColumnWidth = colWidth + 5
Next
       
'Save Excel Workbook
objExcel.ActiveWorkbook.SaveAs(sOutfile)
'objExcel.ActiveWorkbook.Close xlSaveChanges
'objExcel.Quit
WScript.Echo "Script Finished"

' Cleanup and end
oInfile.Close
Set oInfile = Nothing
Set oFSO = Nothing
0
 
LVL 3

Author Comment

by:jon1966
ID: 35494989
Hi,

Thank you.  If I take the original script, and process the file, the output is about 15,000 rows.  If I then sort by the "account" column, then use the duplicate removal feature within Excel 2010, to remove duplicates  based on the "account" field only, the result is 5672 rows remaining.

If I process the same file, with your code, the result is 2430 rows remaining.

Is there a private way to get you in touch with the full file?

J
0
 
LVL 14

Expert Comment

by:sungenwang
ID: 35495276
you can email it to  (remove the spaces) :

s u n g e n w a n g  

< a t > 

h o t m a i l

{ d o t }

 c o m
0
 
LVL 3

Author Comment

by:jon1966
ID: 35516214
Hi, any luck?
0
 
LVL 14

Accepted Solution

by:
sungenwang earned 2000 total points
ID: 35535331
This should do it... I had to set the dictionary key to a string type - CStr(Account) - due to the huge integer number.

sew

' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
Const xlSaveChanges = 1 'Excel Save Changes
Const xlAscending = 1 'Excel Sort Ascending
Const xlDescending = 2 'Excel Sort Decending
Const xlYes = 1 'Excel Header Row Exists

cTab = Chr(9)

' Get input and output file names from command line parms
If (WScript.Arguments.Count > 0) Then
    sInfile = WScript.Arguments(0)
Else
    WScript.Echo "No input filename specified."
    WScript.Quit
End If
If (WScript.Arguments.Count > 1) Then
    sOutfile = WScript.Arguments(1)
Else
    WScript.Echo "No output filename specified."
    WScript.Quit
End If

'Create Excel Object
Set objExcel = CreateObject("Excel.Application")
objExcel.DisplayAlerts = False 'Disable Overwrite Prompts
objExcel.Visible = True 'Hides Excel window

'Create Excel worksheet and add header row
objExcel.Workbooks.Add()
objExcel.Cells(1, 1) = "Account"
objExcel.Cells(1, 2) = "Year"
objExcel.Cells(1, 3) = "LevyBalance"
objExcel.Cells(1, 4) = "Owner"
objExcel.Cells(1, 5) = "Address2"
objExcel.Cells(1, 6) = "Address3"
objExcel.Cells(1, 7) = "Address4"
objExcel.Cells(1, 8) = "City"
objExcel.Cells(1, 9) = "State"
objExcel.Cells(1, 10) = "ZipCode"
intRow = 2 'Set first row to add data

' Create file system object
Set oFSO = CreateObject("Scripting.FileSystemObject")

' Read entire input file into a variable and close it
Set oInfile = oFSO.OpenTextFile(sInfile, ForReading, False, TriStateUseDefault)

Set objDic = CreateObject("Scripting.Dictionary")
    
intcount = 0
Do While Not oInfile.AtEndOfStream
    sLine = oInfile.ReadLine

    Account = Trim(Mid(sLine, 1, 22))
    TYear = Trim(Mid(sLine, 23, 4))
    Over65 = Trim(Mid(sLine, 65, 1))
    Veteran = Trim(Mid(sLine, 66, 1))
    Disabled = Trim(Mid(sLine, 67, 1))
    DatePaid = Trim(Mid(sLine, 69, 8))
    LevyBalance = Trim(Mid(sLine, 87, 11))
    BankRupt = Trim(Mid(sLine, 139, 1))
    Owner = Trim(Mid(sLine, 202, 40))
    Address2 = Trim(Mid(sLine, 242, 40))
    Address3 = Trim(Mid(sLine, 282, 40))
    Address4 = Trim(Mid(sLine, 322, 40))
    City = Trim(Mid(sLine, 362, 24))
    State = Trim(Mid(sLine, 386, 2))
    ZipCode = Mid(sLine, 388, 5) & "-" & Mid(sLine, 393, 4)
           
    If (TYear <> "" And CLng(TYear) < 2010) And Over65 = "" And Veteran = "" And Disabled = "" And DatePaid = "" And ( LevyBalance <> "" And CLng(LevyBalance) >= 2500 ) And BankRupt = "" Then
        If Not objDic.Exists(CStr(Account)) Then

	    objDic.Add CStr(Account), Account

            objExcel.Cells(intRow, 1) = Account
            objExcel.Cells(intRow, 2) = TYear
            objExcel.Cells(intRow, 3) = LevyBalance
            objExcel.Cells(intRow, 4) = Owner
            objExcel.Cells(intRow, 5) = Address2
            objExcel.Cells(intRow, 6) = Address3
            objExcel.Cells(intRow, 7) = Address4
            objExcel.Cells(intRow, 8) = City
            objExcel.Cells(intRow, 9) = State
            objExcel.Cells(intRow, 10) = ZipCode

            intRow = intRow + 1
        End If
    End If
Loop

Set objDic = Nothing

'Format and Sort Excel workbook

Set objSheet = objExcel.ActiveWorkbook.ActiveSheet
objSheet.Columns("A:A").NumberFormat = "0"
Set objRange = objSheet.UsedRange
objRange.EntireColumn.AutoFit()
For iW = objRange.Column To objRange.Columns.Count
    colWidth = objSheet.Columns(iW).ColumnWidth
    objSheet.Columns(iW).ColumnWidth = colWidth + 5
Next
        
'Save Excel Workbook
objExcel.ActiveWorkbook.SaveAs(sOutfile)
'objExcel.ActiveWorkbook.Close xlSaveChanges
'objExcel.Quit
WScript.Echo "Script Finished"

' Cleanup and end
oInfile.Close
Set oInfile = Nothing
Set oFSO = Nothing

Open in new window

0
 
LVL 3

Author Comment

by:jon1966
ID: 35705957
Works great, thank you.
0

Featured Post

Vote for the Most Valuable Expert

It’s time to recognize experts that go above and beyond with helpful solutions and engagement on site. Choose from the top experts in the Hall of Fame or on the right rail of your favorite topic page. Look for the blue “Nominate” button on their profile to vote.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When it comes to writing scripts for a Client/Server computing environment it is essential to consider some way of enabling the authentication functionality within a script. This sort of consideration mainly comes into the picture when we are dealin…
Deploying a Microsoft Access application in a Citrix environment is not difficult but takes a few steps. However, Citrix system people are often of little help, as they typically know next to nothing about Access. The script provided here will take …
This video shows how to quickly and easily deploy an email signature for all users in Office 365 and prevent it from being added to replies and forwards. (the resulting signature is applied on the server level in Exchange Online) The email signat…
When cloud platforms entered the scene, users and companies jumped on board to take advantage of the many benefits, like the ability to work and connect with company information from various locations. What many didn't foresee was the increased risk…
Suggested Courses

872 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question