Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 352
  • Last Modified:

I need a script to compare rows of results that can auto de-dupe via .VBS script for automation purposes

Hi all,

I am in the midst of seeking a script that can take a spread sheet in a given directory, open it, and check all the rows top heading down.  I would want the code to specify if there are exact matches to remove them on all worksheets (not just the active one) and then save the file as deduped.myfile.xls instead of over writting the original file.

Thanks for your help in advance!
0
smyers051972
Asked:
smyers051972
  • 7
  • 5
  • 4
  • +1
1 Solution
 
Dave BrettVice President - Business EvaluationCommented:
You can do the code part with my addin, http://xldynamic.com/source/xld.DupMaster.html

it will remove exact matches over all sheets (ie it doesn't just compare rows within a sheet)

Once downloaded the addin is run via
Tools - The Duplixate Master
See pic below for settings

The save as part will need to be done manually

Cheees

DAve

dm---all-rows.jpg
0
 
smyers051972Author Commented:
Hey thanks appreciate it.  I am seeking a .VBS script however. I need it to be fully executable through scripting this is for automation purposes.

Thank you!
0
 
Dave BrettVice President - Business EvaluationCommented:
Hi,

You should probably post this in the VBscript area.

I understand your need for automation, but this is a pretty significant piece of code to build from scratch

Cheers

Dave
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
smyers051972Author Commented:
How can I change the zones?
0
 
Dave BrettVice President - Business EvaluationCommented:
Pls hit the Request Attention button at top right

Cheers

Dave
0
 
Patrick MatthewsCommented:
Hello smyers051972,

Please specify what you mean by "exact match", and provide a sample file as an example.  EE now
allows you to directly upload files to your question.

Please be advised that once you upload a file, it can be publicly accessed, and that it
may not be possible to fully and permanently delete it.  Therefore, be very careful about
posting proprietary, confidential, or other sensitive information.  If necessary, use "fake"
and/or obfuscated data in your sample.

Regards,

Patrick
0
 
smyers051972Author Commented:
It would be a mailing list, when I say exact match I am meaning

#     First Name        Last Name                 address                        city          state           zip          phone#  

1     John                  Doe                           1234 Street Way          City         State           12345    305-555-0555
2     Jane                  Doe                           4321 Street Way          Another   State          13245    786-555-0556
3     Jane                  Doe                           5321 Street Way          ACity        State          23245    306-555-0556
4     John                  Doe                           1234 Street Way          City         State           12345    305-555-0555
5     Jane                  Doe                           4321 Street Way          Another   State          13245    786-555-0556

In this example items 1,4,5 are dupes because they match another field so they should be deleted, number 2 is close but not an exact match. so if anything of those columns doesnt match completely then they should stay for manual review. There will be other columns such as statistics etc but the first set of columns shown above will be used for duplucate matching: first/last name, address, city, state, zip, and phone number.

The reason why I want to have a complete match is because they may be two different people, we are pulling from 8 different database sources (all sql 2000) and merging to one sheet.  What I want the script to do is check all worksheets and find the duplicates and delete them then save the file as deduped.myfile.xls

Thank you!
0
 
ahammarCommented:
Do you need to find dupes between sheets also, or just dupes that are on each sheet.  In other words, if there is an entry on the second sheet that is a duplicate of one on the first sheet, do you need that one deleted?
0
 
Patrick MatthewsCommented:
smyers051972 said:
>>The reason why I want to have a complete match is because they may be two different people, we are
>>pulling from 8 different database sources (all sql 2000) and merging to one sheet.  

I should think you would be better off doing the duplicate checking on the SQL Server.  How are you
gathering the data from there?
0
 
ahammarCommented:
Here is code and a working workbook that worked for me...
Note:
When you start the macro, it asks which file to check for dupes

It simply checks the first 7 columns of each row (A through G) for duplicaes.  It does not check any farther than that.

For now, there must be a value in every row of column A..(no blanks in column A until after the last row of data)

It saves the new file as: deduped.myfile.xls  in the same folder as the file you opened to check for dupes

It checks every sheet, but it does not cross reference sheets.  In other words, when it changes sheets, it starts over.  It does not compare any data in 1 sheet, with data in another sheet.  This can be easily changed simply by removing the 3 lines that empty the collection object.

Here is the code and a working file.  You can download the file and just run the macro...

If you have any questions or need something changed...let me know
:-)
Albert
Here is the workbook and the code:



Public Sub DeleteDupes()
 
    Dim OtherWb As Workbook
    Dim FileName As String
    Dim r As Range
    Dim rval As String
    Dim collString As Collection
    Dim n As Integer
    Dim x as Integer
    Dim sh As Worksheet
    
    
    Set collString = New Collection
   
FileName = Application.GetOpenFilename("Excel Workbook Files (*.xls), *.xls")
   
Application.ScreenUpdating = False
If FileName <> "False" Then
      Set OtherWb = Workbooks.Open(FileName, False, True)
 
        For Each sh In OtherWb.Sheets
              Set r = sh.Range("A1")
            
                On Error GoTo errhandler
                Do Until r = ""
                    rval = ""
                    For n = 0 To 6
                        rval = rval & Trim(r.Offset(0, n).Value)
                    Next n
                    collString.Add rval, rval
                    Set r = r.Offset(1, 0)
cont:
                Loop
                For x = 1 To collString.Count
                    collString.Remove 1
                Next x
                
        Next sh
 
    OtherWb.SaveAs OtherWb.Path & "\deduped.myfile.xls" ', 56
    OtherWb.Close False
        
Exit Sub
 
errhandler:
    If Err.Number = 457 Then
        Set r = r.Offset(1, 0)
        r.Offset(-1, 0).EntireRow.Delete xlUp
        Resume cont
    Else
        MsgBox Err.Description
        Exit Sub
    End If
    
End If
   
Application.ScreenUpdating = True
   
 
End Sub

Open in new window

DeleteDupes.xls
0
 
ahammarCommented:
remember to always make copies of your files before testing...
0
 
smyers051972Author Commented:
Thank you for the replies! I will test this out as a possible solution.

I wanted to touch on one thing though just in case a revision is needed and I am sure you will know this once I touch out on this.  We take all data from 8 different databases on 8 servers, reason why is each server is self contained because they are out at sea aboard a vessel.  Currently I have some VB scripts I got here that after I export to excel as 8 different files, I can take them all and merge to one single file, either 1 worksheet or tabbed for each ship.  I would probably like to check all work sheets.  So dupe check at the sql server would be impossible to do, we only have 384k via satellite :)

Thank you!
0
 
smyers051972Author Commented:
When you say I can run the macro is this ok from the command prompt? I am scripting it out with out running excel manually? i.e. I can save this as a .VBS?  also to specify the directory can I change *.xls (the second one shown) to C:\myfiles\*.xls ?


Thanks again!
0
 
ahammarCommented:
The code I provided is not script...it is vba code.  It is meant to be run directly from Excel.  I don't know anything about script.  There may be a way to run this from the command prompt, but if there is, I do not know...sorry about that...
As far as the VBA goes, if you want to specify a starting directory, then in line 14 of the code I posted above...(it is a blank line) put this:
chdir "C:\myfiles"

You will still be asked which file to open to check, but it will start in that folder.
0
 
smyers051972Author Commented:
Ahh darn!

I really need it to be .VBS format I was unaware that you were suggesting VBA, to use it I would need to actually be running excel and I am looking to streamline the entire process and have it save the file automatically.

I will see if theres a way to convert your script to VBS format in the mean time any help would be great for anyone that can create a .VBS script.

Thank you!
0
 
smyers051972Author Commented:
I found out from an article at: http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Controls/Q_23074943.html?sfQueryTermInfo=1+convert+vb+vba 

that you can not convert. I will close the question and accept your solution then post a new one in VB script section I think I miss categorized this in the first place.

Thank you for your help
0
 
ahammarCommented:
Hi smyers,
Thanks for the points and the grade!  I'm sorry it's not exactly what you wanted.  If you were not a premium member I would say have this question deleted and your points refunded.  You never have to accept a solution that does not work for you....plus you did specify script in your question.  I just wrote it in vba for 2 reasons....lots of times the asker thinks vba is called script, and the other is a lot of times the vba works for them just as well even if they do want script....but neither is the case for you....but since you are a premium member, it's a lot easier to just accept an answer than write to support and ask to have a question deleted...and you don't lose anything...
As far as that link you posted....that was not talking about .vbs script.  That was talking about converting to VB (Visual Basic).

I can take this code and convert it to VB with no problem and it would be a standalone app, but that's still not the same as script...
Let me know if that's what you want, and I'll do that for you.  I'll be gone until late tonight though, so it wouldn't be today...just so you'll know...

:-)
Albert
0
 
Dave BrettVice President - Business EvaluationCommented:
Good to see that you got this sorted with a vbs elsewhere. I had assumed that you had wanted to find duplicate rows betwen sheets - hence my significant comment - finding duplicates within a sheet is a much easier process

Regards

Dave
0

Featured Post

Get expert help—faster!

Need expert help—fast? Use the Help Bell for personalized assistance getting answers to your important questions.

  • 7
  • 5
  • 4
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now