Link to home
Start Free TrialLog in
Avatar of bndit
bndit

asked on

Powershell: How to compare AD user accounts to CSV record row

I have a script that provisions AD user accounts based on a daily user export from HR. This export has your typical information about the employee (location, title, employeenumber, etc.)

Currently, I update every single record in AD even if there was no change to the employee record. However, I'm trying to be more efficient and ONLY process new employees and employees that have had changes to their information (title, location, last name change, etc)

I think I can use hashtables to identify the new employees something like this:

cls
$ADUsers = Get-QADuser -SearchRoot "mydomain" -Sizelimit 0 | Group-Object -Property UNIQUEID -AsHashTable
$Employees = Import-Csv -Path C:\temp\employees.csv | Group-Object -Property UNIQUE_ID -AsHashTable
$NewEmployees = New-Object System.Collections.ArrayList
$Employees.GetEnumerator() | % {
	if(!$ADUsers[$_.Key])
	{
		$NewEmployees.Add($_.Key) | Out-Null
	} 
}

Open in new window


However, aside from checking every single row against the data that you're monitoring in the AD user account (i.e. you want to update the AD user account when either of 3 fields has changed 1) Title 2) last name, or 3) location), I'm not sure how I can accomplish this?  Do you have any ideas? suggestions? Something I was thinking about was computing a hash value (not even sure if this is available in poweshell) for the row in the CSV and compute another hash for the AD user data (you'd have to select the right properties etc), and then do a compare (Compare-Object?)? Would the use of a database table help in this case? Load the existing AD user data into one table and compare it to the data in second table, which contains AD user data from the last run and only get the records that are different (Select query?)?

Anyway, thanks for your help.
Avatar of footech
footech
Flag of United States of America image

I think the most straight-forward way of doing this is with Compare-Object.  The retrieval from AD and the lookup from the CSV need to have the same property names for any that are compared, so if for instance the headers of the CSV didn't match the property names as retrieved with AD query, you could use Select-Object with calculated properties to adjust for this.  But assuming all the property names line up, here's a basic version of what I was thinking.
$adusers = Get-ADUser -filter * -properties Department,Title |
 Select sAMAccountName,Surname,givenName,Title,Department  #this Select statement isn't necessarily needed, but could be used to filter unwanted properties or use calculated properties
$list = Import-CSV employees.csv
Compare-Object $adusers $list -property sAMAccountName,surName,givenName,Title,Department -passthru |
 Where {$_.SideIndicator -eq "=>"} |
 ForEach { `
 Set-ADUser -identity $_.sAMAccountName `
   -surName $_.surName `
   -givenName $_.givenName `
   -Title $_.Title `
   -Department $_.Department `
 }

Open in new window


Another method I've seen is to use a ForEach inside of a ForEach, looping through the ADusers query and searching for a match (the 2nd loop) from the file, then doing a compare.

EDIT: Noticed a bug in the script, made the correction.
Avatar of bndit
bndit

ASKER

Sorry for the delayed reply.

@footech - I think this could work....however, I have a questions.....Ok, without going into the details of the script, here's piece of code that relates to the compare-object piece you point out
#Split csv employee file into "Terminated" and "Non-Terminated" employee 
#records. "Employment_Status" has two values; 1) Active and 2) Terminated.
#Get "Non-terminated" employees.
$RawNonTerminatedEmployeeRecords = @(Import-CSV -Path $HR_Employee_Feed_File | 
? {$_.Employment_Status -ne "Terminated"})
#I omitted this function on purpose.
$NonTerminatedEmployeeRecords = Load-UniqueValueToFile $RawNonTerminatedEmployeeRecords
#
#Get "Terminated" employees.
$TerminatedEmployeeRecords = @(Import-CSV -Path $HR_Employee_Feed_File | 
? {$_.Employment_Status -eq "Terminated"})
#
#Determine new employees.
$NewEmployee_Records = @($NonTerminatedEmployeeRecords | % {
	#Determine if the employee's UNIQUEID exists in AD.
	if(!$ADUsers[$_.UNIQUE_ID])
	{
		#If not found, the employee must be new.
		$_
	}
} | Select-Object *)
#
#Check each employee record information against its corresponding AD account to
#determine modified employees.
$EmployeeRecordsToUpdate = @($NonTerminatedEmployeeRecords | % {
	#reset value
	$EmployeeHasChanged = $null	
	#Determine if the employee's UNIQUEID exists in AD.
	if($ADUsers[$_.UNIQUE_ID])
	{
		#Build a custom object for comparison
		$HR_Record_Obj = New-Object PSObject -Property @{
			TITLE = (Validate-Data $_.TITLE)
			EMPLOYEE_NUMBER = (Validate-Data $_.EMPLOYEE_NUMBER)
		}
		
		#Convert AD attributes to HR column names
		$ADUser_Obj = $ADUsers[$_.UNIQUE_ID] | 
		Select-Object @{Label="TITLE";Expression={$_.Title}},`
		@{Label="EMPLOYEE_NUMBER";Expression={$_.EmployeeNumber}}
				
		#Compare employee record in CSV file and AD user account.
		# Reference object is the employee's HR employee record, and the 
		# Difference objct is te AD use account. '<=' indicates that a difference
		# was found in the HR employee record.
		$EmployeeHasChanged = Compare-Object -ReferenceObject $HR_Record_Obj -DifferenceObject $ADUser_Obj -Property TITLE,EMPLOYEE_NUMBER |
		Where-Object {$_.SideIndicator -eq "<="}
		
		#Get employee record to update.
		if ($EmployeeHasChanged -ne $null)
		{
			$_
		}
	}
} | Select-Object *)
#Merge new and terminated employees.
$EmployeeRecordsToProcess = $TerminatedEmployeeRecords + $NewEmployee_Records
$EmployeeRecordsToProcess = $EmployeeRecordsToProcess + $EmployeeRecordsToUpdate

Open in new window


So basically, I build two small custom objects with the properties I want to compare (HR Record data and AD user account data).  If these two custom objects are different on any of the specified properties, I then grab the entire row from the CSV file. Is this the right approach?

Notes:
1) I've omitted the AD query
2) UniqueId is a custom AD attribute that is valid and populated.
3) I've omitted a function that detects when there's a duplicate record based on UNIQUEID, and returns the record with the oldest hire date (column value in the file)

In terms of grabbing entire CSV rows when processing CSV files like I am in this piece of code, I'd like to make sure that I'm doing the right thing.  Which of the below loops should I be using? the one that generates the custom object or array? (Save the "depends on what you're doing" answer =)   the goal is to end up with *only* the records that I need to process in an array or custom object (thus my question) so that the rest of the script can process them correctly.

# This generates a custom object
$File = Import-Csv C:\temp\data.csv

$NewRecords = $File | % {
     #some condition
     if ($_.uniqueid -ne $null) {
         $_
      }
} | Select-Object *

#While this generates an array
$File = Import-Csv C:\temp\data.csv

$NewRecords = @($File | % {
     #some condition
     if ($_.uniqueid -ne $null) {
         $_
      }
} | Select-Object *)

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of footech
footech
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of bndit

ASKER

@footech - thx so much for the time and the pointers. You were right, I didnt need the " .. | Select * " in this case so I've taken that out. Also, I think that the example that I used where I used the UNIQUEID -eq $null was a bad one since I'm using UNIQUEID to index my AD accounts. Thus, I find AD user accounts based on UNIQUEID.
Also, I liked your approach to consolidate lines 14-57, but it doesnt work due to the empty cells in the csv file. Not sure if you've encountered this or not, but say that you have a CSV column for middle initials; comparing that empty cell against an empty (Null) AD attribute for middle initials, the comparison will yield False. Thus, I have to convert this empty csv cells to $null via the Validate-Data function, which uses [System.DBNull]::Value. Here's the script I came up with:
function Validate-Data
{
	Param
	(
		[Parameter(Mandatory=$true,
					Position=0)]
		[AllowEmptyString()]
		[string]
		$strData
	)
	Process
	{
		if($strData.Trim() -ne $script:DBNull)
		{
			return $strData
		} else {
			return $null
		}
	}
}#END FUNCTION:  Validate-Data
#-------------------------------------------------------------------------------

$filter = "(&(objectCategory=person)(objectclass=user)(uniqueid=*))"

$ADUsers = Get-QADUser -SearchRoot "dc=mydomain,dc=local" `
-Enabled `
-DontUseDefaultIncludedProperties `
-IncludedProperties UniqueID,EmployeeNumber,JobCodeId,FacilityId,Area,Market,Title `
-LdapFilter $filter `
-Service "LDAPSERVER.mydomain.local" `
-SizeLimit 0 | Group-Object -Property UNIQUEID -AsHashTable
$script:DBNull = [System.DBNull]::Value

$HR_Employee_Feed_File = "c:\EmployeeData.csv"

#Get "Non-terminated" employees.
$NonTerminatedEmployeeRecords = @(Import-CSV -Path $HR_Employee_Feed_File | 
? {$_.Employment_Status -ne "Terminated"})

#Get "Terminated" employees.
$TerminatedEmployeeRecords = @(Import-CSV -Path $HR_Employee_Feed_File | 
? {$_.Employment_Status -eq "Terminated"})


#Determine if the employee's UNIQUEID exists in AD. If not found, the employee 
#must be new.
$NewEmployeeRecords = @($NonTerminatedEmployeeRecords | % {	
	if(!$ADUsers[$_.UNIQUE_ID]) { $_  } 
	})

#Check each employee record information against its corresponding AD account to
#determine modified employees.
$EmployeeRecordsToUpdate = @($NonTerminatedEmployeeRecords | % {
	#Determine if the employee's UNIQUEID exists in AD.
	if($ADUsers[$_.UNIQUE_ID])
	{
		#Expand AD user account information from hashtable.
		$ADUser = $ADUsers[$_.UNIQUE_ID] | Select-Object Title,EmployeeNumber,`
		JobCodeId,FacilityId,Area,Market,Description,FirstName,LastName,`
		StreetAddress,City,PostalCode,StateOrProvince
		
		$EmployeeAddress = if ((Validate-Data $_.LINE_TWO) -ne $null)
		{
			(Validate-Data $_.LINE_ONE) + " " + (Validate-Data $_.LINE_TWO)
		} else {
			(Validate-Data $_.LINE_ONE)
		}
		#Compare employee record in CSV file and AD user account.
		if (((Validate-Data $_.TITLE) -ne $ADUser.Title) -or `
		((Validate-Data $_.TITLE) -ne $ADUser.Description) -or `
		((Validate-Data $_.FIRST_NAME) -ne $ADUser.FirstName) -or `
		((Validate-Data $_.LAST_NAME) -ne $ADUser.LastName) -or `
		((Validate-Data $_.EMPLOYEE_NUMBER) -ne $ADUser.EmployeeNumber) -or `
		((Validate-Data $_.JOB_CODE) -ne $ADUser.JobCodeId) -or `
		((Validate-Data $_.FACILITY_ID) -ne $ADUser.FacilityId) -or `
		((Validate-Data $_.AREA_NAME) -ne $ADUser.Area) -or `
		((Validate-Data $_.MARKET_NAME) -ne $ADUser.Market) -or `
		($EmployeeAddress -ne $ADUser.StreetAddress) -or `
		((Validate-Data $_.CITY) -ne $ADUser.City) -or `
		((Validate-Data $_.ZIPCODE) -ne $ADUser.PostalCode) -or `
		((Validate-Data $_.STATE) -ne $ADUser.StateOrProvince))
		{ $_ }
	}
})

#Merge new and terminated employees.
$EmployeeRecordsToProcess = $TerminatedEmployeeRecords + $NewEmployeeRecords + $EmployeeRecordsToUpdate

#Get employee records, and sort them so that records with "Terminated" are at top.
$EmployeeRecordsToProcess = $EmployeeRecordsToProcess | Sort-Object EMPLOYMENT_STATUS -Descending

Open in new window


I'd be interested to know if you see any more deficiencies or areas where I can improve. Thanks!
Interesting, I hadn't noticed that before.  A CSV with a the following two lines will give completely different results for each (empty string vs. null as you pointed out).
Gamgee,Samwise,,,,,,sgamgee
Gamgee,Samwise

Open in new window

In my testing I always had all the fields filled out or missing off the end so I didn't notice.  I think I'll have to see if I can find a way to convert all those empties to nulls.

Nothing else is standing out to me.  I think you should be good.
Avatar of bndit

ASKER

I didn't know either until I got unexpected results and spent some time researching why....if you find an easier/more efficient way to convert those empty cells to nulls, I'd be interested to know....thanks again for all the help.
Avatar of bndit

ASKER

Thanks!
I finally spent a little time to create a function to help with this.  Nothing too fancy, but thought I would post it in case someone might find it helpful.
<#  This function can be used to convert the values of Properties that are
    created by the Import-CSV cmdlet when a field is empty
    This can be helpful when comparing to AD attributes that are null.
 Usage:
    $list = Import-CSV userlist.csv
    $list | ForEach { ConvertPropTo-Null $_ }
 or:
    $list | ConvertPropTo-Null
#>

function ConvertPropTo-Null
{
  [CmdletBinding()]
  Param (
    [Parameter(Mandatory=$true,
              ValueFromPipeline=$true)]
    $obj
    )

  Process
  {
    $obj |
     Get-Member -MemberType NoteProperty |
     Select -ExpandProperty Name |
     ForEach `
     {
       If ($obj.$_ -eq "")
       { $obj.$_ = $null }
     }
  }
}

Open in new window