Link to home
Start Free TrialLog in
Avatar of simon vitali
simon vitali

asked on

Powershell: Using a select-xml command to get node value and tags from a .xml file

HI this is a continuation of thread from before
old thread

the conditions for this project has changed a little rendering the script that Michelangelo had provided no longer useful for my cause although it worked for the previous purpose

heres what changed:
the files in questions are XML format (but are not named .xml they are .901 files)
i need to remove a whole node from the *.xml files ,  including the tag

i.e.
 <errsql yadadayda > 
random sql error the statement has terminated
</errsql>

this whole line need to be removed form the file

here is a snippet of the xml file:

<chgsql ts="07:01:58.971" op="O" jb="ADD" tb="CLL_TAB" sc="BATCH"> <fld fd="F1000" af="PAL"/>
<fld fd="F1148" af="43000135163"/>
<idx fd="F1577" af="5167599565"/>
<idx fd="F1578" af="LINK"/>
<fld fd="F1001" af="1"/>
<fld fd="F902" af="00003814"/>
</chgsql>
<errsql ts="07:01:58.986" op="O" lv="50" ob="">
<sql>INSERT INTO CLL_TAB [F1000,F1148,F1577,F1578,F1001,F902] VALUES [:F1000,:F1148,:F1577,:F1578,:F1001,:F902] </sql>
<prm>PAL,43000135163,5167599565,LINK,1,00003814,</prm>
</errsql>
<chgsql ts="07:01:58.986" op="O" jb="ADD" tb="CLL_TAB" sc="BATCH">
<fld fd="F1000" af="PAL"/>
<fld fd="F1148" af="43000135164"/>
<idx fd="F1577" af="5166282554"/>
<idx fd="F1578" af="LINK"/>
<fld fd="F1001" af="1"/>
<fld fd="F902" af="00003814"/>
</chgsql>
<errsql ts="07:01:58.986" op="O" lv="50" ob="">
<sql>INSERT INTO CLL_TAB [F1000,F1148,F1577,F1578,F1001,F902] VALUES [:F1000,:F1148,:F1577,:F1578,:F1001,:F902] </sql>
<prm>PAL,43000135164,5166282554,LINK,1,00003814,</prm>
</errsql>


here is what i have as a command thats not working so far which is Michealangelo's script modified to search using xpath

$root = 'C:\storeman\Office\ZY999901\*.*'
# set exclusions: none 
$exclude = $null
$SearchString = select-xml -Path 'C:\storeman\Office\ZY999901\*.*' -XPath //errsql | Select-Object -ExpandProperty node
# find all files which have $searchstring inside 
$object = Get-ChildItem -Path $root -Recurse -exclude $exclude | Select-String -pattern $SearchString  -list # list: first occurence
# delete all lines which contain $searchstring and overwrite source file
$object | % { (get-content  $_.path) | select-string -pattern $SearchString -notmatch | Set-Content $_.path}

Open in new window



my xpath command works on its own just fine except it doesn't include the <errsql> </errsql> tags in results , i could live with this

when i run the command in powershell ISE it runs for a very long time and doesn't return and results or errors but it doesn't do anything
pls help, thank you
Avatar of oBdA
oBdA

Don't treat XML as text.
This will remove all <errsql> nodes from all .901 documents that in the folder specified.
Matching documents will be overwritten without warning.
As usual, try it on a test folder first ...
$root = 'C:\Temp\29157334'
$filter = '*.901'
Get-ChildItem -Path $root -Filter $filter |
	Select-String -Pattern '<errsql' -CaseSensitive -List |
	ForEach-Object {
		Write-Host "Processing '$($_.Path)'"
		Try {
			$xml = [xml](Get-Content -Path $_.Path)
			$xml.SelectNodes('//errsql') | ForEach-Object {
				[void]$_.ParentNode.RemoveChild($_)
			}
			$xml.Save($_.Path)
		} Catch {
			$_ | Write-Error
		}
	}

Open in new window

Avatar of simon vitali

ASKER

Hi Obda
thank you for that, it worked great! my only issue is that it takes a heck of a long time to run on one 434MB file(2hr50min) the file did get down to 175MB after the script ran.  Is there anyway with using a different file encoding that the script run time could be dramtically sped up?
I'm afraid I somewhat doubt the "dramatically".
Can you attach a zipped sample file of, say, ~1MB expanded size (a full file, not a snippet), or do they contain sensitive information?
no there is no sensitive data, i will upload that
thank you
here is the data file in question, a bit larger than i hoped but i was breaking the formatting when try to go smaller, sorry about that
90727999.901
ASKER CERTIFIED SOLUTION
Avatar of oBdA
oBdA

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Well i would say you missed the mark on one thing!, it is working dramatically faster :)
Processed one file of 434MB in about 3 minutes saving alot of time.
I saw the differences in your script and while im not that powershell techsavvy yet to fully understand I continue to learn in hopes of one day being able to write scripts half this good.  Thank you for your invaluable help it is greatly appreciated.
Cheers
Nailed it