Read xml file and lookup data in csv

Hello,

I have two files, one XML and one CSV. The CSV file has two columns: the first contains the value to look for in the XML, and the second contains the value needed to update the XML. The task is to read a specific value from the XML file and compare it to the data in the first column of the CSV file. If there is a match, take the value from the second column and update the XML with it. If there is no match, update the XML with a generic value. The CSV file is comma delimited. There could be multiple XML files in subdirectories that need to be read.

Example files attached.  

The XML - <testdata TestDataToLookup="test2" random="data" TestDataToInsertFromCSV="" random="data"/>

test2 would be looked up in the csv file and column 2 would be inserted in between "" of TestDataToInsertFromCSV

Assuming the file would start with something like:
$path_search = "c:\test\ReferenceFile.csv"
$path_target = "c:\test\"   # there could be multiple xml files in different subdirectories to be scanned

Get-Content c:\LookupFile.xml

$LookupFile = Import-CSV "C:\ReferenceFile.csv"

Very new to powershell.  Any help in the right direction would be appreciated.

Thank you
test.csv
test.xml
cbonesAsked:
aikimarkCommented:
What would be the result of what you want?  Please post an "after" version.
cbonesAuthor Commented:
Hi,

The xml would be:   <testdata TestDataToLookup="test2" random="data" TestDataToInsertFromCSV="xml2" random="data"/>

csv has test2,xml2

xml2 would fill in TestDataToInsertFromCSV="xml2"

Thank you
aikimarkCommented:
So, only a single update?
cbonesAuthor Commented:
Only a single update to the file.   There could be multiple xml files in the subdirectories but only one update per file.
aikimarkCommented:
This isn't the only way to do this, but a hash table makes the lookups very fast if you have a lot of XML files.
cd "c:\users\mark\downloads"

# Build the lookup table: column 1 (FindText) -> column 2 (ReplaceWith)
$lookupfile = @{}
Import-Csv "Q_29119161.csv" -Header FindText,ReplaceWith |
    ForEach-Object { $lookupfile[$_.FindText] = $_.ReplaceWith }

# Join the lines into one string: -match only fills $Matches for a single string, not an array of lines
$x = (Get-Content "Q_29119161.xml") -join "`r`n"

# (?s) lets . match across line breaks; group 2 captures the value to look up
if ($x -match '(?s)^(.+TestDataToLookup="([^"]+)" random="data" TestDataToInsertFromCSV=")[^"]*(".+)$') {
    $fnd = $Matches[2]
    if ($lookupfile.ContainsKey($fnd)) {
        Set-Content -Path "Q_29119161.xml" -Value ($Matches[1] + $lookupfile[$fnd] + $Matches[3])
    }
}

cbonesAuthor Commented:
Hi Mark,

I tried running the script.  It runs through without any errors but it didn't update the field.  I'm working on it this morning and will get back to you.

Thank you
cbonesAuthor Commented:
The test file has fewer fields in it than the real file I'm working with.  I'm sure that is why it updates the test file and not the one I'm using.  How can I expand it to look at the sixth field for a match?

Thank you
aikimarkCommented:
Post a representative sample of the XML that needs to be changed.
cbonesAuthor Commented:
Hi,

Attached is the exact xml file with dummy information filled in.  Anything inside " " will be different in each file.  The field to update in the xml is surgeon="", and its value will come from the .csv.

Thank you.
Sample.xml
oBdACommented:
XML should be handled as such:
$caseCsvFile = 'C:\Temp\case.csv'
$caseXmlFile = 'C:\Temp\case.xml'
$defaultSurgeon = 'generic'
$patient2Surgeon = @{}
Import-Csv -Path $caseCsvFile -Header PatientId, Surgeon | ForEach-Object {$patient2Surgeon[$_.PatientId] = $_.Surgeon}
$xml = [xml](Get-Content -Path $caseXmlFile)
$patientInfo = $xml.SelectSingleNode('Case/PatientInfo')
$patientId = $patientInfo.GetAttribute('patientId')
If ($patient2Surgeon.ContainsKey($patientId)) {
	$surgeon = $patient2Surgeon[$patientId]
} Else {
	$surgeon = $defaultSurgeon
}
$patientInfo.SetAttribute('surgeon', $surgeon)
$xml.Save($caseXmlFile)

cbonesAuthor Commented:
Thank you.  This worked using the test csv and xml.  Can this be made to loop through subdirectories and process every file named case.xml?
oBdACommented:
This processes a complete directory. To process subdirectories as well, add -Recurse to the Get-ChildItem options in line 7.
$caseCsvFile = 'C:\Temp\case.csv'
$defaultSurgeon = 'generic'
$xmlPath = 'C:\Temp'

$patient2Surgeon = @{}
Import-Csv -Path $caseCsvFile -Header PatientId, Surgeon | ForEach-Object {$patient2Surgeon[$_.PatientId] = $_.Surgeon}
Get-ChildItem -Path $xmlPath -Filter *.xml | ForEach-Object {
	Write-Host "Processing '$($_.Name)': " -NoNewline
	$xml = [xml](Get-Content -Path $_.FullName)
	If ($patientInfo = $xml.SelectSingleNode('Case/PatientInfo')) {
		$patientId = $patientInfo.GetAttribute('patientId')
		If ($patient2Surgeon.ContainsKey($patientId)) {
			$surgeon = $patient2Surgeon[$patientId]
		} Else {
			$surgeon = $defaultSurgeon
		}
		Write-Host "$($patientId) --> $($surgeon)"
		$patientInfo.SetAttribute('surgeon', $surgeon)
		$xml.Save($_.FullName)
	} Else {
		Write-Warning "Node PatientInfo not found: '$($_.FullName)'"
	}
}
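
As a rough sketch of the -Recurse variant: the temporary sandbox folder and the sample file contents below are made up for illustration, and -Filter is narrowed to case.xml on the assumption that only files with that name should be touched.

```powershell
# Hypothetical sandbox so the sketch can run without touching real data
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("caseDemo_" + [guid]::NewGuid().ToString('N'))
$sub  = Join-Path $root 'sub1'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -Path (Join-Path $root 'case.csv') -Value 'test2,xml2'
Set-Content -Path (Join-Path $sub 'case.xml') -Value '<Case><PatientInfo patientId="test2" surgeon="" /></Case>'

$defaultSurgeon = 'generic'
$patient2Surgeon = @{}
Import-Csv -Path (Join-Path $root 'case.csv') -Header PatientId, Surgeon |
    ForEach-Object { $patient2Surgeon[$_.PatientId] = $_.Surgeon }

# -Recurse descends into subdirectories; -Filter limits the scan to case.xml
Get-ChildItem -Path $root -Filter 'case.xml' -Recurse | ForEach-Object {
    $xml = [xml](Get-Content -Path $_.FullName)
    $patientInfo = $xml.SelectSingleNode('Case/PatientInfo')
    if ($patientInfo) {
        $patientId = $patientInfo.GetAttribute('patientId')
        # Fall back to the generic value when the id is not in the CSV
        $surgeon = if ($patient2Surgeon.ContainsKey($patientId)) { $patient2Surgeon[$patientId] } else { $defaultSurgeon }
        $patientInfo.SetAttribute('surgeon', $surgeon)
        $xml.Save($_.FullName)
    }
}
```

Trying it against a copy of the real folder tree first is probably wise, since $xml.Save overwrites each file in place.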

cbonesAuthor Commented:
Thank you.  It does loop through the subdirectories, but it keeps returning the warning "Node PatientInfo not found: 'C:\test\test.xml'".

There is a match available in the csv.
oBdACommented:
That means that the node PatientInfo can't be found in the xml file it found, so it does nothing.
This is the XML you posted, which processed fine here.
<?xml version="1.0" encoding="utf-8"?>
<Case fromMWLServer="0" caseId="1.1.1.1.1.1.1.1.1.1.1.1">
  <PatientInfo patientId="test2" patientFirstName="FirstName" patientMiddleInitial="" patientLastName="LastName" accession="1234567" surgeon="" referrer="" speciality="" procedure="NAME" station="" department="" hospital="NAME" procedureDate="01/01/1901 " birthDate="" gender="U" typeOfSurgery="" surgicalDetails="" studyinstanceuid="1.1.1.1.1.1.1.1.1.1.1.1.1.1" ICD10="" patientEmail="" surgeonID="1111-11111-11111-11111" saveDicomVideo="0" />
  <DataClips>
    <Clip type="video" path="NAME.mp4" textAnnotation="" startTime="" endTime="" DICOMRetrieved="No" clipSelected="1" ChannelType="primary" sopInstanceId="" StorageCommitted="No" />
  </DataClips>
</Case>

If you changed the XML layout in any way before posting, then the script will fail - it expects a node with XPath "Case/PatientInfo".
Note that XML is cAsE sEnSiTiVe.
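
To illustrate the case sensitivity, here is a minimal sketch against a made-up two-node document (not the real file) that uses the same node names as the sample above:

```powershell
# Made-up minimal document with the same node names as the posted sample
$doc = [xml]'<Case><PatientInfo patientId="test2" surgeon="" /></Case>'

$exact = $doc.SelectSingleNode('Case/PatientInfo')   # matches the node
$wrong = $doc.SelectSingleNode('case/patientinfo')   # $null: the case differs

$exact.GetAttribute('patientId')   # reads the attribute from the matched node
$null -eq $wrong                   # True when the XPath matched nothing
```

If the second lookup describes your file, the script would take the Write-Warning branch exactly as you saw.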
cbonesAuthor Commented:
Thank you.  I'm not sure what happened.  I copied the script again, resaved it and it worked.  Question: the second column in the csv will contain LastName,FirstName.  The script will fail because of the commas.  Can I change the delimiter to a pipe using -Delimiter "|" to correct that?
oBdACommented:
Either that (or any other delimiter that is guaranteed to not be a part of a field), or enclose the fields in double quotes:
"test1","Last1, First1"
"test2","Last2, First2"

cbonesAuthor Commented:
I tried using the quotes and also tried changing the delimiter to pipe in line 6 to

Import-Csv -Delimiter "|" -Path  $caseCsvFile -Header PatientId, Surgeon | ForEach-Object {$patient2Surgeon[$_.PatientId] = $_.Surgeon}

but I receive a "Method invocation failed" error.  Is there more than -Delimiter "|" to be added?

Thank you
cbonesAuthor Commented:
Sorry, to correct my note above: the "" are working with a comma delimiter.  The pipe doesn't work.  Is the pipe delimiter in the wrong location?

Thank you
oBdACommented:
Named arguments can appear in any order, PS doesn't care.
This works just as expected in combination with the "-Delimiter '|'" argument:
test1|Last1, First1
test2|Last2, First2
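
Both variants can be checked without touching any files. The following is only a sketch: the rows are the sample data from above, and ConvertFrom-Csv stands in for Import-Csv so the input can be given as strings.

```powershell
# Sample rows; ConvertFrom-Csv parses strings the way Import-Csv parses a file
$quoted = '"test1","Last1, First1"', '"test2","Last2, First2"'
$piped  = 'test1|Last1, First1', 'test2|Last2, First2'

# Quoted fields keep the embedded comma with the default comma delimiter
$fromQuoted = $quoted | ConvertFrom-Csv -Header PatientId, Surgeon
# Unquoted fields need a delimiter that never occurs inside a field
$fromPiped  = $piped  | ConvertFrom-Csv -Header PatientId, Surgeon -Delimiter '|'

$patient2Surgeon = @{}
$fromPiped | ForEach-Object { $patient2Surgeon[$_.PatientId] = $_.Surgeon }

$fromQuoted[1].Surgeon       # Last2, First2
$patient2Surgeon['test2']    # Last2, First2
```

If the pipe version still fails on your machine, the CSV file itself is worth a second look, for example to confirm it really contains the pipe characters after saving.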
