Link to home
Create AccountLog in
Avatar of Mandy_
Mandy_

asked on

Powershell / Excel - Check if data match between 2 worksheets and set correct domain

this question refers to this already answered question

Dear folks,

It's regarding the already answered question (pls see link above)

The supplied script i've extended according to my requirements. The smtp addresses are create in this form as they should.

To complete my happiness there's a "small" mistake to fix. In the excel file are 2 worksheets
as you can see in the attached pictures. Now the code takes the domains from the worksheet
"company" only in the existing sort order of the cells B2-B4.  

What i exactly want:
1. Compare the data from worksheet "ED" column "company" (cell C2-C40)
 with worksheet "company" columns "company short" and "domain"
Cells A2-200 and B2-200.  

2. If the data matching e.g. (xxx) from "ED" with (xxx) (xxxexample.com)  
from worksheet company, attach the correct domain in field smtp (cell D2-D40) worksheet "ED"

pls note:  there are 200 different domains and shorties (xxx) to check if match or not
the pictures are only short examples


$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible=$true
$WorkBook = $objExcel.Workbooks.Open($strPath)
$worksheet = $workbook.sheets.item("EDM")
$worksheet2 = $workbook.sheets.item("COMPANY")
2..40 | %{
	$rowNum = $_
	$name = $worksheet.cells.item($rowNum,2).Value()
	$comp = $worksheet2.cells.item($rowNum,2).Value()

    #Select-String $worksheet2.cells.item($rowNum,2).Value() -Pattern "([a-z]+).+?=.+?([a-z.]+)" |
    #ForEach { $_.matches} |
    #ForEach { $tokens = @($_.groups[1].value,$_.groups[2].value); $data.Add($tokens[0],$tokens[1])}


    $name.ToLower() -match "^(?<last>[a-z -]+),( (?<middle>[a-z]+))? (?<first>[a-z-]+)( (?<code>[a-z0-9()]+))?$" | Out-Null
    $first = $matches["first"] -replace "[^a-z]","."
    $middle = $matches["middle"] -replace "[^a-z]","."
    $middle2 = $matches["middle"] -replace "[^a-z]","."
    $last = $matches["last"] -replace "[^a-z]","."
    $code = $matches["code"] -replace "[()]"

    #$comp = $data[$company]

    If ($middle)
    { $middle = "." + $middle }
    If ($code)
    { $code = "." + $code }
    
    
	$worksheet.cells.item($rowNum,4) = "{0}{1}{2}.{3}{4}@{5}" -f $first,$middle,$middle2,$last,$code,$comp.toLower()
}

Open in new window


User generated image

User generated image

User generated image
Avatar of David Johnson, CD
David Johnson, CD
Flag of Canada image

Sample data and existing scripts would be helpful and speed things up.. as files please
Avatar of Mandy_
Mandy_

ASKER

hi

pls find the requested files attached. Pls rename smtp_2_excel to *.ps1

thanks in advance for your help
data1.xlsx
smtp-2-excel
here:
cls
$strPath="c:\temp\data1.xlsx"
$domainMap = @{}
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible=$false
$WorkBook = $objExcel.Workbooks.Open($strPath)
$worksheetCompany = $workbook.sheets.item("COMPANY")
$worksheetED = $workbook.sheets.item("ED")

$rowNum = 2

do{
	$compshort = $worksheetCompany.cells.item($rowNum,1).Value()
	if($compshort -eq $null) {break}
	$domainMap[$compshort] = $worksheetCompany.cells.item($rowNum++,2).Value()
}while($true)

$rowNum = 2

do{
	$compshort = $worksheetED.cells.item($rowNum,3).Value()
	if($compshort -eq $null) {break}
	$domain = $domainMap[$compshort]
	$worksheetED.cells.item($rowNum++,4) = $domain
}while($true)

$WorkBook.Save()
$objExcel.Quit()

Open in new window

copy D:\Downloads\data1s.xlsx C:\temp\data1.xlsx
$strPath="c:\temp\data1.xlsx"
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible=$true
$WorkBook = $objExcel.Workbooks.Open($strPath)
$worksheet = $workbook.sheets.item("ED")
$worksheet2 = $workbook.sheets.item("COMPANY")

2..15 | %{
	$rowNum = $_
	$name = $worksheet.cells.item($rowNum,2).Value()
	$comp = $worksheet2.cells.item($rowNum,2).Value()

    $splitted = $name.Split(",")
    $maximum = $splitted.Count
    $email = $splitted[0]
        for ($counter = 1; $counter -le $maximum; $counter++) {
             if ($splitted[$counter] -ne $null){
                $email = $email + "." + $splitted[$counter]
                $email =  $email -replace " ",""
                }
            }
        $email = $email + $code +"@" + $comp
        $email = $email.ToLower()
    $worksheet.cells.item($rowNum,4) = "{0}" -f $email
    $email = ""
    $code = ""
}
$WorkBook.Save()
$objExcel.Quit()
#start $strPath

Open in new window

sedgwick,

Shouldn't you integrate that in the original source? What you posted last just retrieves the company part - what about the remainder of the email addresses?


Mandy_,

Usually I would recommend to use (only) Excel logic and formulas for performing this task, but let's continue with the PS-only approach. Putting some pieces together, plus improving the Excel access technique, this is my suggestion:
$strpath = 'C:\Temp\EE\data.xlsx'

$xls = New-Object -ComObject Excel14.Application
$xls.Visible = $true
$wb   = $xls.Workbooks.Open($strPath)
$wsED = $wb.Sheets.Item('ED')

# Fill the Dictionary or Hash - doesn't matter
$companys = @{}
$wb.Sheets.Item('Company').UsedRange.Rows | select -skip 1 | % { $_ } | ? { $_.Value2[1,1] } |
  % { $companys += @{$_.Value2[1,1] = $_.Value2[1,2]} }

# and now the worksheet
$wsED.UsedRange.Rows | select -Skip 1 | % { $_ } | ? { $_.Value2 } |
  % {
    $_.Value2[1,2].ToLower().ToLower() -match "^(?<last>[a-z -]+),( (?<middle>[a-z]+))? (?<first>[a-z-]+)( (?<code>[a-z0-9()]+))?$" | Out-Null
    $first, $middle, $middle2, $last = $matches["first", "middle", "middle2", "last"]  -replace "[^a-z]","."
    $code    = $matches["code"]   -replace "[()]"

    if ($middle)  { $middle = "." + $middle }
    if ($code)    { $code = "." + $code }
   
    $_.Cells.Item(4).Value2 = "{0}{1}{2}.{3}{4}@{5}" -f $first,$middle,$middle2,$last,$code,$companys[$_.Value2[1,3]].toLower()
}

$wb.Save()
$xls.Quit()

Open in new window

Note that it has NO restriction on how many rows in ED are filled out ...
@Qlemo

@Mandy_ represented specific requirement which i provided solution to.
i'd rather him test my code against the excel files first, just to make sure it works and it fully addresses the requirement.
once approved, i'll integrate it into his script.
That's important information you should provide with your answer. As posted, the code looks like the final solution.
you right, my bad.

@Mandy_
my code addresses the requirement you provided, i haven't merged your script with mine.
plz, test my code and once approved, i'll merge it with your current script.
cheers
Avatar of Mandy_

ASKER

@qlemo

User generated image
+   % { $companys += @{$_.Value2[1,1] = $_.Value2[1,2]} }
+                                 ~
"]" missing after Arrayindex.


+   % { $companys += @{$_.Value2[1,1] = $_.Value2[1,2]} }
+                                 ~
Operator "=" missing after key atHashliteral.
    + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : MissingEndSquareBracket

Open in new window

Avatar of Mandy_

ASKER

dear ve3ofa,

thank you for your work. unfortunately its only a part of the solution. The solution of qlemo
do it in one step. I will accept a multiple solution consider your post as well.
But first qlemo has to resolve the problem i still have with his code.

pls see by: Mandy_Posted on 2013-07-14 at 18:28:36 ID: 39324939
Mandy_,
That's difficult, as the code works for me. I have to guess why the cells array should not be two-dimensional.
The only error I found yet is the New-Object line (includes the Excel version number in a wrong notation). Line 3 should be
$xls = New-Object -ComObject Excel.Application

Open in new window

but that error results in having many error messages displayed, and never the one you presented. Currently testing further options ...
Tried again, with the original downloaded data1.xlsx, and no issues with the code :/
Avatar of Mandy_

ASKER

I tried now at home again but still the same. The debugger shows problems between # #
in the code below. Tomorrow i will check that at my company.

User generated image
$wb.Sheets.Item('Company').UsedRange.Rows | select -skip 1 | % { $_ } | ? { $_.Value2[1,1] } | % #{# $companys += @{$_.Value2[1#,#1] = $_.Value2[1,2]} }

Open in new window

For testing only, does this code work for you (not resulting in errors)?
$strpath = 'C:\Temp\EE\data1.xlsx'

$xls = New-Object -ComObject Excel.Application
$xls.Visible = $true
$wb   = $xls.Workbooks.Open($strPath)
$wsED = $wb.Sheets.Item('ED')

# Fill the Dictionary or Hash - doesn't matter
$companys = @{}
$wb.Sheets.Item('Company').UsedRange.Rows | select -skip 1 | % { $_ } | ? { $_.Cells.Item(1,1).Value2 } |
  % { $companys += @{$_.Cells.Item(1,1).Value2 = $_.Cells.Item(1,2).Value2} }

Open in new window

I've skipped the real work part, filling out the ED worksheet. If this code works, I'll rearrange that part too.
Got past the first part.. now that error is gone.  You code fails and is unbelievably slow
Checkout the speed and pass/fail of the code in this video

Sedgewick 2:45 and fails
Qlemo        6: 43 and fails
My Code    0:29 passes both and also with the 2 middle names
ve3ofa,

Yes, my code is very slow, in particular reading the Company worksheet into an hash table.
But your code does not match the company's three letter shortcut with the Company worksheet - it simply matches row X of each worksheet. Of course that is fastest. And you are limiting the number of rows processed to 14 (2 .. 15), which is certainly too restrictive for practical use.

Further, I just used the code for splitting name parts provided by sedgwick in a prior question, assuming it is well designed and has its purpose.
Because of the catastrophic performance of reading in Company data I have skipped that part, and use Excel formulas instead - much faster ... Also I've changed the regex to be less restrictive.
$strpath = 'C:\Temp\EE\data1.xlsx'

$xls = New-Object -ComObject Excel.Application
$xls.Visible = $true
$wb   = $xls.Workbooks.Open($strPath)

$wb.Sheets.Item('ED').UsedRange.Rows | select -Skip 1 | ? { $_.Cells.Item(1) } |
  % {
    $actRow = $_.Cells
    $actRow.Item(2).Value2.ToLower() -match "^(?<last>[\w\s-]+),( (?<middle>[\w]+))? (?<first>[\w-]+)( (?<code>[\w()]+))?$" | Out-Null
    $first, $middle, $middle2, $last = $matches["first", "middle", "middle2", "last"]  -replace "[\W]","."
    $code    = $matches["code"]   -replace "[()]"

    if ($middle)  { $middle = "." + $middle }
    if ($code)    { $code = "." + $code }
   
    $actRow.Item(4).Formula = " ""{0}{1}{2}.{3}{4}@"" & VLOOKUP(C{5}; Company!A:B; 2; false)" -f $first,$middle,$middle2,$last,$code,$_.Row
    $actRow.Item(4).Formula = $_.Cells.Item(4).Value2
}

$wb.Save()
$xls.Quit()

Open in new window

Avatar of Mandy_

ASKER

hi, thank you so much for all your help. unfortunately the sheet changed
 

"barnabas.tamas.nagy@" & VLOOKUP(C25; Company!A:B; 2; false)

the excel sheet has been changed by gov pls see below

User generated imageUser generated image

pls see the data file
data-test.xlsx
$strpath = 'C:\Temp\EE\data-test.xlsx'

$xls = New-Object -ComObject Excel.Application
$xls.Visible = $true
$wb   = $xls.Workbooks.Open($strPath)

$wb.Sheets.Item('EDM').UsedRange.Rows | select -Skip 1 | ? { $_.Cells.Item(1).Value2 } |
  % {
    $actRow = $_.Cells
    $actRow.Item(2).Value2.ToLower() -match "^(?<last>[\w\s-]+),( (?<middle>[\w]+))? (?<first>[\w-]+)( (?<code>[\w()]+))?$" | Out-Null
    $first, $middle, $middle2, $last = $matches["first", "middle", "middle2", "last"]  -replace "[\W]","."
    $code    = $matches["code"]   -replace "[()]"

    if ($middle)  { $middle = "." + $middle }
    if ($code)    { $code = "." + $code }
   
    $actRow.Item(13).Formula = "= ""{0}{1}{2}.{3}{4}@"" & VLOOKUP(C{5}; Company!A:B; 2; false)" -f $first,$middle,$middle2,$last,$code,$_.Row
    $actRow.Item(13).Formula = $_.Cells.Item(13).Value2
}

$wb.Save()
$xls.Quit()

Open in new window

Avatar of Mandy_

ASKER

@qlemo

thank you so much for your help. with the example file attached above i'm getting
in column smtp (13) as result:  -2146826259
I'm sorry but still not working. I used different excel versions but cannot find the error
That integer results from an error, probably "#NAME", which code is then translated into a numeric value by my line 18. Comment or remove that line, or set a breakpoint on it if in ISE / PowerGUI, to see the real issue.

Oh, I see a potential issue - Excel Automation is too silly to properly translate between local and generic function syntax in Excel formulas. I have a "foreign" Excel (German), which uses different function names and delimiters, even though I used the non-localized "Formula" property (vs. "FormulaLocal"). I really hate that. Back to the formula, line 17 should sound:
    $actRow.Item(13).Formula = "= ""{0}{1}{2}.{3}{4}@"" & VLOOKUP(C{5}, Company!A:B, 2, false)" -f $first,$middle,$middle2,$last,$code,$_.Row

Open in new window

The change is that I now use commas instead of semicolons for separating parameters.
What source file are you using?  EDM ??

If I run your code from the supplied data1.xlsx  after changing EDM to ED
http:#39324308
User generated imageqlemo-out-data1.xlsx
No, I'm using the new sheet EDM of 'data-test.xlsx'
Avatar of Mandy_

ASKER

hi folks,

I'm very sorry for this inconvenience.

I replaced that line 17 and set ISE Breakpoints and i'm getting this


"Exception when setting "Formula": "Exception from HRESULT: 0x800A03EC""In C: \ Users \ Administrator \ Desktop \ smtp2excel2.ps1: 17 characters: 5
+ $ ActRow.Item (13). Formula = "=" "{0} {1} {2}. {3} {4} @" "& VLOOKUP (C {5} COMPANY! ...+ CategoryInfo          : NotSpecified: (:) [], SetValueInvocationException
    + FullyQualifiedErrorId : CatchFromBaseAdapterSetValueTI


The script is running through but no result. I replaced the local names of the excel
version but also no result.

@ve3ofa
unfortunately your code not resolves the correct company codes in worksheet "company"
There are additional spaces in the error message for Formula - please check that. The HRESULT is probably pointing towards a syntax error in the formula string.
Avatar of Mandy_

ASKER

hi

i checked the spaces already. What you think could be wrong in the formula string?
I've really no idea :(
That is, the spaces aren't there? It is important that the line in http:#a39331609 is used exactly as is. The error message shows e.g. a space and no comma in
  VLOOKUP (C {5} COMPANY! ...
which should be
  VLOOKUP (C{5}, COMPANY!...

Further, is your Excel US or different locale? If different, you might have a different parameter separator than comma.
Avatar of Mandy_

ASKER

hi qlemo,

i changed parameter already to ";" or ",".  My excel is a different locale german version.
How should i change the windows separator settings?
Maybe i could resolve that problem in excel. Are you also an excel expert?
ASKER CERTIFIED SOLUTION
Avatar of Qlemo
Qlemo
Flag of Germany image

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
Avatar of Mandy_

ASKER

again excellent work. thank you so much