FDiskWizard asked:

PowerShell: Faster comparison of 2 large strings/files?

I have a working script, but it is way too slow.
The script compares a large text file of SMTP addresses against addresses in Active Directory.

I have done it two ways:
1st method: query AD directly for each email address.
2nd method: pull all of the AD proxyAddresses values with a single query into a string, then compare each address from the file against those values.
Below is the 2nd method, which seems to run at about the same speed as querying AD directly for each check.

I bet you PowerShell gurus know how to speed this up?


 
$Domain = "DC=YOUR,DC=local"   # Root context of AD domain.
$SMTPDomain = "Sample.com"     # Looking for users@Sample.com

# Addresses to check, one per line.
$SMTPList = Get-Content "C:\Script Files\TestAddresses.txt"

$Filter = "(proxyaddresses=*smith@$SMTPDomain*)"     #### Limited to *smith...* for testing ###
$ProxyAddresses = Get-QADObject -DontUseDefaultIncludedProperties -IncludedProperties proxyaddresses -SearchRoot $Domain -SizeLimit 0 -SearchScope Subtree -LdapFilter $Filter | Select-Object proxyaddresses

ForEach ($Email in $SMTPList) {
    $Found = $null
    # The whole AD result is converted to a string and searched once per address.
    $Found = ($ProxyAddresses | Out-String | Select-String -Pattern $Email)

    If ($Found -eq $null) {
        $ListNotFound = $ListNotFound + "`n" + $Email
        Write-Host "Didn't Find: " $Email
    }
    Else {
        $ListFound = $ListFound + "`n" + $Email
        Write-Host "Was Found: " $Email
    }
}

$ListNotFound | Out-File -FilePath "C:\Temp\NotFound.txt"
$ListFound | Out-File -FilePath "C:\Temp\Found.txt"


ASKER CERTIFIED SOLUTION
chrismerritt

This solution is only available to members.

ASKER

Thanks. Works great.
After some testing, I guess your method is faster because of the ADSI methods.

For me, from a PC, it runs in about 2.5 minutes: 100,000+ objects, checking for 5800+ email addresses from the text file.

I had to modify it a little to look for all mail-enabled objects, not just users (objectClass=*). The speed did not really seem to change.

You also pointed out something else I was unaware of: that you can query proxyaddresses with =smtp:..., like '(proxyaddresses=smtp:user@acme.com)'. I had always used *'s, such as '(proxyaddresses=*user@acme.com*)'. Although I knew this was a multi-valued field, I didn't realize a single value could be queried like that.
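
As a rough illustration of the exact-value filter described above, here is a minimal sketch that builds such a filter string from an address. The function names ConvertTo-LdapFilterValue and New-ProxyAddressFilter are hypothetical helpers, not part of the accepted solution (which is not shown in this thread); the escaping follows the reserved characters listed in RFC 4515.

```powershell
# Hypothetical helper (not from the thread): escape the characters that
# RFC 4515 reserves inside an LDAP filter value.
function ConvertTo-LdapFilterValue {
    param([Parameter(Mandatory = $true)][string]$Value)
    # Escape the backslash first so the later escapes are not escaped again.
    $Value.Replace('\', '\5c').Replace('*', '\2a').Replace('(', '\28').Replace(')', '\29')
}

# Hypothetical helper: build an exact-match filter for one SMTP address.
function New-ProxyAddressFilter {
    param([Parameter(Mandatory = $true)][string]$Address)
    '(proxyAddresses=smtp:{0})' -f (ConvertTo-LdapFilterValue $Address)
}

# Prints the filter string for a single address.
New-ProxyAddressFilter 'user@acme.com'
```

The resulting string could be passed to the -LdapFilter parameter already used in the original script. A filter without a leading wildcard generally gives the directory a better chance of using an attribute index, although how much that helps depends on the environment.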

Excellent.

I would still be curious whether there is a way to make the search faster in memory, after importing all the data from the text file and AD.
The AD search does seem to work pretty fast, but it just seems to me that an in-memory search SHOULD be faster.
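
One possible in-memory approach, sketched here under assumptions rather than taken from the thread: load the proxyAddresses values into a case-insensitive HashSet once, then test each address from the file with a single hash lookup. The original loop instead re-runs Out-String on the whole AD result and does a regex search for every address. The sample arrays below are stand-in data, and the sketch assumes PowerShell 3.0 or later, where System.Core is loaded by default.

```powershell
# Stand-in data; in the real script these would come from Get-Content and
# from the proxyAddresses values returned by the AD query.
$smtpList = @('john.smith@sample.com', 'missing@sample.com')
$proxyAddresses = @('SMTP:John.Smith@Sample.com', 'smtp:jsmith@sample.com', 'X400:c=US;a= ;p=Sample')

# Build a case-insensitive set of addresses once, stripping the "smtp:" prefix.
$known = New-Object 'System.Collections.Generic.HashSet[string]' ([System.StringComparer]::OrdinalIgnoreCase)
foreach ($value in $proxyAddresses) {
    if ($value -like 'smtp:*') {
        [void]$known.Add($value.Substring(5))
    }
}

# Each check is now a hash lookup instead of a scan of the whole AD result.
$found = New-Object 'System.Collections.Generic.List[string]'
$notFound = New-Object 'System.Collections.Generic.List[string]'
foreach ($email in $smtpList) {
    if ($known.Contains($email.Trim())) { $found.Add($email) } else { $notFound.Add($email) }
}
```

Note that this matches whole addresses exactly, whereas Select-String -Pattern treats each address as a regular expression and also matches substrings, so the two approaches can give different results on edge cases. The cost of pulling every proxyAddresses value out of AD is still paid up front.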

Good stuff. Thanks!
Great solution for comparing a text file of values to values that are in Active Directory.
chrismerritt

It has to retrieve the values from AD, so I expect that is more or less as quick as it will get.

You can wrap commands or sections of your script in Measure-Command to benchmark them, e.g.:

Measure-Command {
    # Your script line 1
    # Your script line 2
}
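
For a concrete version of the template above, here is a small sketch that times a stand-in workload. The workload and the variable names are illustrative only. Measure-Command discards the output of the script block and returns a System.TimeSpan.

```powershell
# Time a stand-in workload: 5,000 membership tests against a 60,000-item array.
$items = 1..60000
$elapsed = Measure-Command {
    foreach ($i in 1..5000) {
        $null = $items -contains $i
    }
}

# Report the elapsed time in whole milliseconds.
'{0:N0} ms' -f $elapsed.TotalMilliseconds
```

Timing the AD query and the comparison loop in separate Measure-Command blocks should show which of the two dominates the run.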

Maybe you can find the biggest bottleneck and look at reducing those sections.

I suspect that if you dumped the whole of your AD fields into memory it would actually perform slower. I find that if I get 60000 users' worth of data from AD and enumerate the object count with $SearchResults.Count, it takes ages because it has to step through the list (more than a couple of minutes just to count the objects), and it uses up a lot of system RAM to boot.

Comparing a list of 5800 objects against the list of 60000 objects would then have to scan the list many, many times. Individual queries that each complete quickly would seem logically quicker to me.

The only other thing I can think of is to check that the proxy addresses field is indexed. Search Google for indexing AD fields; it may speed up LDAP queries.