How do I get PowerShell to look through a text file and write output to a file?

I have data for several companies that looks similar to this:

KLM*00*          *00*          *AA*DDDSSS         *45*4564564465     *5464564*1552*U*5465*000021*0*P*>~
AAA*00*          *00*          *00*123456789     *01*123456     *123456*1234*^*00123*000000123*0*P*>~
GS*PO*9988771231*0123456789*20140909*0123*123*Y*123456~
ST*555*00123456~
CDE*00*OP*12345**20140908~
PO1**15*CA*1.00**AD*123456456456*OP*123123~
PO4*10~
PO1**22*CA*10.05**DD*12313123213444*AB*012132~
DZA*D****TEST3~
PO1**10*CA*11.22**AA*99991132123231*AB*989889~
KLM*B****TEST2~
PO1**20*CA*12.00**KL*1231321332123*AB*756465~
EFG*A****TEST~
FGH*0*000000123~
KLM*00*          *00*          *AA*DDDSSS         *45*4564564465     *5464564*1552*U*5465*000021*0*P*>~
BBB*00*          *00*          *00*123456789     *01*123456     *123456*1234*^*00123*000000123*0*P*>~
GS*PO*11111776655*0123456789*20140909*0123*123*Y*123456~
ST*555*00123456~
CDE*00*OP*12345**20140908~
PO1**12*CA*1.00**AD*123456456456*OP*123123~
PO4*10~
PO1**15*CA*10.05**DD*12313123213444*AB*012132~
DZA*D****TEST3~
PO1**10*CA*11.22**AA*99991132123231*AB*989889~
KLM*B****TEST2~
PO1**20*CA*12.00**KL*1231321332123*AB*756465~
EFG*A****TEST~
FGH*0*000000123~
KLM*00*          *00*          *AA*DDDSSS         *45*4564564465     *5464564*1552*U*5465*000021*0*P*>~
CCC*00*          *00*          *00*123456789     *01*123456     *123456*1234*^*00123*000000123*0*P*>~
GS*PO*3333776655*0123456789*20140909*0123*123*Y*123456~
ST*555*00123456~
CDE*00*OP*12345**20140908~
PO1**11*CA*1.00**AD*123456456456*OP*123123~
PO4*10~
PO1**2*CA*10.05**DD*12313123213444*AB*012132~
DZA*D****TEST3~
PO1**10*CA*11.22**AA*99991132123231*AB*989889~
KLM*B****TEST2~
PO1**20*CA*12.00**KL*1231321332123*AB*756465~
EFG*A****TEST~
FGH*0*000000123~
KLM*00*          *00*          *AA*DDDSSS         *45*4564564465     *5464564*1552*U*5465*000021*0*P*>~
DDD*00*          *00*          *00*123456789     *01*123456     *123456*1234*^*00123*000000123*0*P*>~
GS*PO*87778778*0123456789*20140909*0123*123*Y*123456~
ST*555*00123456~
CDE*00*OP*12345**20140908~
PO1**12*CA*1.00**AD*123456456456*OP*123123~
PO4*10~
PO1**99*CA*10.05**DD*12313123213444*AB*012132~
DZA*D****TEST3~
PO1**10*CA*11.22**AA*99991132123231*AB*989889~
KLM*B****TEST2~
PO1**12*CA*12.00**KL*1231321332123*AB*756465~
EFG*A****TEST~
FGH*0*000000123~

I want PowerShell to go through the different companies and write the data for each company to a single file, as follows:

PONumber: 9988771231
Total: $the calculated total goes here

PONumber: 11111776655
Total: $the calculated total goes here

PONumber: 3333776655
Total: $the calculated total goes here

PONumber: 87778778
Total: $the calculated total goes here

All of the above should go into one file.

This is the script I am starting with:

$InputFile = "C:\Data\original.txt"
$OutputFile = "NewData.txt"

[double]$Total = 0
($Input = Get-Content -Path $InputFile) | ? {$_ -match "PO1"} | % {$col=$_.split("*");$total += [double]$col[2] * [double]$col[4]}

$Totline = "Total: `$$Total"
write-host "This is Total `$$total"

GC $InputFile |?{$_ -match "(PO\*)(\d{0,10})"} | Out-Null

$poline = "PONumber: $($Matches[2])"

@"
$poline
$totline
"@ | out-file $OutputFile -Encoding UTF8

What modifications do I have to make to the script to display the appropriate information for all the companies it finds?

becraig

Is the expectation to calculate a separate total for each PO, like this?
GS*PO*xxx - [line output 1]
....... [These lines would be totaled for the PO on the line above]
GS*PO*xxx - [line output 2]
....... [These lines would be totaled for the PO on the line above]
etc.

Is that what you are looking for?

ASKER CERTIFIED SOLUTION

SubSun

aikimark

If you applied this regular expression pattern to the text:

(?:GS\*PO\*(\d+))|(?:PO1.*?CA\*(\d+\.\d\d))

Then you could iterate the matches, gathering the data you want. If the first submatch is not empty, then it is the PO Number. If the second submatch is not empty, it is a dollar amount.
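The match iteration described above might look roughly like the following sketch. It is not aikimark's code. The pattern is a variant of the one posted: it adds a capture group for the quantity field so that quantity times price can be summed, as the asker's script does. The input path comes from the question; the variable names and the use of [decimal] are assumptions.

```powershell
# Sketch only: the variable names are assumptions, not from the thread.
$InputFile = "C:\Data\original.txt"
$text = Get-Content -Path $InputFile -Raw   # -Raw requires PowerShell 3.0 or later

# Group 1: PO number taken from a GS*PO* segment.
# Groups 2 and 3: quantity and unit price taken from a PO1 segment
# (the quantity group is an addition to the pattern posted above).
$pattern = 'GS\*PO\*(\d+)|PO1\*\*(\d+)\*CA\*(\d+\.\d\d)'

$results = @()
$current = $null
foreach ($m in [regex]::Matches($text, $pattern)) {
    if ($m.Groups[1].Success) {
        # A PO number was matched: start a new running total for it.
        $current = New-Object PSObject -Property @{
            PONumber = $m.Groups[1].Value
            Total    = [decimal]0
        }
        $results += $current
    }
    elseif ($current) {
        # A PO1 segment was matched: add quantity * price to the current PO.
        $current.Total += [decimal]$m.Groups[2].Value * [decimal]$m.Groups[3].Value
    }
}

# Emit one "PONumber / Total" pair per PO, separated by a blank line.
$results | ForEach-Object {
    "PONumber: $($_.PONumber)"
    "Total: `$$($_.Total)"
    ""
}
```

Because the whole file is scanned in one pass, this version does not depend on knowing which segment opens or closes a company block; each GS*PO* match simply starts a new total.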

E=mc2

ASKER

@aikimark - do you mean to add the text to the first script I have in my entry?

E=mc2

ASKER

@SubSun, thanks for the script. I made some small modifications; however, it only seems to output data for 2 of the 4 companies in another file I have. Why is it not picking up all the companies?
Also, I changed the script to use the line that starts with KLM as the marker, since that is where the data for each company starts.

aikimark

I was thinking about parsing the entire file with the regex pattern. It is a different approach than the one already suggested.

What does your PS script look like now?

SubSun

What happens if you try the code I suggested, without any modifications?

E=mc2

ASKER

@SubSun, FGH in the data I provided is fictitious.

If I run your script with the actual three letters from the real file in place of FGH, nothing happens; not even a blank file is created.
When I then add the correct full path for the output, a blank file is created.

SubSun

You need to replace the FGH pattern with the pattern that marks where the data ends for each company. The code worked with the input data you posted in your question. I'm travelling, so I can't test right now.
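For readers who cannot see the accepted solution, here is a hypothetical sketch of a line-by-line approach that relies on an end-of-block pattern in the way described above. It is not SubSun's script. The $EndPattern variable, the full output path, and the use of [decimal] are assumptions; the column positions follow the asker's original script.

```powershell
# Hypothetical sketch; the accepted solution itself is not shown in this thread.
$InputFile  = "C:\Data\original.txt"
$OutputFile = "C:\Data\NewData.txt"   # assumed full path
$EndPattern = '^FGH\*'                # replace with the real closing segment of each company block

$po = $null
[decimal]$total = 0

$report = foreach ($line in Get-Content -Path $InputFile) {
    if ($line -match '^GS\*PO\*(\d+)') {
        # Remember the PO number for the block currently being read.
        $po = $Matches[1]
    }
    elseif ($line -match '^PO1\*') {
        # Same columns as the original script: [2] is quantity, [4] is unit price.
        $col = $line.Split('*')
        $total += [decimal]$col[2] * [decimal]$col[4]
    }
    elseif ($line -match $EndPattern -and $po) {
        # End of a company block: emit its result and reset the state.
        "PONumber: $po"
        "Total: `$$total"
        ""
        $po = $null
        $total = 0
    }
}

$report | Out-File -FilePath $OutputFile -Encoding UTF8
```

In a sketch like this, nothing is emitted if $EndPattern never matches a line, which leaves an empty output file. Anchoring the patterns with ^ also matters for data like the sample, where KLM begins both the opening line of each block (KLM*00*) and a line inside it (KLM*B*).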

aikimark

@100questions

When you post data, it should be an accurate representation of your production data. If your representative sample deviates too much from your actual data, the experts will post solutions that you will have to tweak in order to solve your actual problem. These question threads are our only means of solving problems at EE and none of the experts is psychic, so the accuracy of your data examples is very important.

E=mc2

ASKER

It seems to work now. However, I had to change the output file to a full path in the same location as the start path, which I also changed.
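A bare file name such as "NewData.txt" is resolved by Out-File against the session's current location, which is not necessarily the folder that holds the script or the input file. A minimal sketch of one way to make the destination explicit, assuming the code is saved as a .ps1 file and run under PowerShell 3.0 or later (where $PSScriptRoot is populated in scripts):

```powershell
# Assumption: this runs from a saved .ps1 file under PowerShell 3.0 or later.
# Build the output path next to the script instead of relying on the current location.
$OutputFile = Join-Path -Path $PSScriptRoot -ChildPath "NewData.txt"

# Show where a bare relative name would have been written, and where the file goes now.
Write-Host "Relative names resolve against: $(Get-Location)"
Write-Host "Writing report to: $OutputFile"

# Placeholder content; the real report text would be piped here instead.
"placeholder report text" | Out-File -FilePath $OutputFile -Encoding UTF8
```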