simon vitali
asked on
Powershell: Using a select-xml command to get node value and tags from a .xml file
HI this is a continuation of thread from before
old thread
the conditions for this project has changed a little rendering the script that Michelangelo had provided no longer useful for my cause although it worked for the previous purpose
heres what changed:
the files in questions are XML format (but are not named .xml they are .901 files)
i need to remove a whole node from the *.xml files , including the tag
i.e.
<errsql yadadayda >
random sql error the statement has terminated
</errsql>
this whole line need to be removed form the file
here is a snippet of the xml file:
here is what i have as a command thats not working so far which is Michealangelo's script modified to search using xpath
my xpath command works on its own just fine except it doesn't include the <errsql> </errsql> tags in results , i could live with this
when i run the command in powershell ISE it runs for a very long time and doesn't return and results or errors but it doesn't do anything
pls help, thank you
old thread
the conditions for this project has changed a little rendering the script that Michelangelo had provided no longer useful for my cause although it worked for the previous purpose
heres what changed:
the files in questions are XML format (but are not named .xml they are .901 files)
i need to remove a whole node from the *.xml files , including the tag
i.e.
<errsql yadadayda >
random sql error the statement has terminated
</errsql>
this whole line need to be removed form the file
here is a snippet of the xml file:
<chgsql ts="07:01:58.971" op="O" jb="ADD" tb="CLL_TAB" sc="BATCH"> <fld fd="F1000" af="PAL"/>
<fld fd="F1148" af="43000135163"/>
<idx fd="F1577" af="5167599565"/>
<idx fd="F1578" af="LINK"/>
<fld fd="F1001" af="1"/>
<fld fd="F902" af="00003814"/>
</chgsql>
<errsql ts="07:01:58.986" op="O" lv="50" ob="">
<sql>INSERT INTO CLL_TAB [F1000,F1148,F1577,F1578,F1001,F902] VALUES [:F1000,:F1148,:F1577,:F15 78,:F1001, :F902] </sql>
<prm>PAL,43000135163,5167599565,LINK ,1,0000381 4,</prm>
</errsql>
<chgsql ts="07:01:58.986" op="O" jb="ADD" tb="CLL_TAB" sc="BATCH">
<fld fd="F1000" af="PAL"/>
<fld fd="F1148" af="43000135164"/>
<idx fd="F1577" af="5166282554"/>
<idx fd="F1578" af="LINK"/>
<fld fd="F1001" af="1"/>
<fld fd="F902" af="00003814"/>
</chgsql>
<errsql ts="07:01:58.986" op="O" lv="50" ob="">
<sql>INSERT INTO CLL_TAB [F1000,F1148,F1577,F1578,F1001,F902] VALUES [:F1000,:F1148,:F1577,:F15 78,:F1001, :F902] </sql>
<prm>PAL,43000135164,5166282554,LINK ,1,0000381 4,</prm>
</errsql>
here is what i have as a command thats not working so far which is Michealangelo's script modified to search using xpath
$root = 'C:\storeman\Office\ZY999901\*.*'
# set exclusions: none
$exclude = $null
$SearchString = select-xml -Path 'C:\storeman\Office\ZY999901\*.*' -XPath //errsql | Select-Object -ExpandProperty node
# find all files which have $searchstring inside
$object = Get-ChildItem -Path $root -Recurse -exclude $exclude | Select-String -pattern $SearchString -list # list: first occurence
# delete all lines which contain $searchstring and overwrite source file
$object | % { (get-content $_.path) | select-string -pattern $SearchString -notmatch | Set-Content $_.path}
my xpath command works on its own just fine except it doesn't include the <errsql> </errsql> tags in results , i could live with this
when i run the command in powershell ISE it runs for a very long time and doesn't return and results or errors but it doesn't do anything
pls help, thank you
ASKER
Hi Obda
thank you for that, it worked great! my only issue is that it takes a heck of a long time to run on one 434MB file(2hr50min) the file did get down to 175MB after the script ran. Is there anyway with using a different file encoding that the script run time could be dramtically sped up?
thank you for that, it worked great! my only issue is that it takes a heck of a long time to run on one 434MB file(2hr50min) the file did get down to 175MB after the script ran. Is there anyway with using a different file encoding that the script run time could be dramtically sped up?
I'm afraid I somewhat doubt the "dramatically".
Can you attach a zipped sample file of, say, ~1MB expanded size (a full file, not a snippet), or do they contain sensitive information?
Can you attach a zipped sample file of, say, ~1MB expanded size (a full file, not a snippet), or do they contain sensitive information?
ASKER
no there is no sensitive data, i will upload that
thank you
thank you
ASKER
here is the data file in question, a bit larger than i hoped but i was breaking the formatting when try to go smaller, sorry about that
90727999.901
90727999.901
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Well i would say you missed the mark on one thing!, it is working dramatically faster :)
Processed one file of 434MB in about 3 minutes saving alot of time.
I saw the differences in your script and while im not that powershell techsavvy yet to fully understand I continue to learn in hopes of one day being able to write scripts half this good. Thank you for your invaluable help it is greatly appreciated.
Cheers
Processed one file of 434MB in about 3 minutes saving alot of time.
I saw the differences in your script and while im not that powershell techsavvy yet to fully understand I continue to learn in hopes of one day being able to write scripts half this good. Thank you for your invaluable help it is greatly appreciated.
Cheers
ASKER
Nailed it
This will remove all <errsql> nodes from all .901 documents that in the folder specified.
Matching documents will be overwritten without warning.
As usual, try it on a test folder first ...
Open in new window