Further modification of the PowerShell script

This continues my earlier requests for modifications to the PowerShell script.

The distinct line types in the input file are the following:

ST*850*0001~
BEG*00*SA*4610173073**20140929~
...

PO1**10*CA***UK*12345678912345~
CTP**UCP*10.20~
PID*F*08***Desc 1~
....
The three lines above (PO1, CTP, PID) can repeat any number of times.

IEA*1*000001234~

Script to work with:

$InputFile = "C:\Data\original.txt"
$OutputFile = "C:\Data\new.txt"

 Function ParseText ($OutputFile,$InputFile){
 Begin{
             $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
             Set-Content $OutputFile $null
             $Data = Get-Content -Path $InputFile | ?{$_ -match "(BEG\*)|(PO1\*)|(CTP\*)|(PID\*)|(IEA\*)"}
       }
 Process{
       $Data | % {
       #Check the end of data set
             If ($_ -match "IEA\*"){
                   If($poline -ne $null -and $total -ne $null){
                         $Totline = "Total: `$$Total"
                         Write-host "This is Total `$$total"
                         #Write the data collection to output file
                         "$poline$line`r`n$totline`r`n" | out-file $OutputFile -Encoding UTF8 -Append
                         #reset the variables
                         $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
                   }
             }
             #Collect the PO Number
             If($_ -match "(SA\*)(\d{3,10})"){
                   $poline = "PONumber: $($Matches[2])"
             }
             #Collect the quantity and UPC from the PO1 line
             If($_ -match "(PO1\*{1,})(\d{1,})\*\w{2}\*{1,}\w{2}\*(\d{1,})"){
                   $count = $Matches[2]
                   $UPC = $Matches[3]
             }
             #Collect the price from the CTP line and the description from the line after it
             If($_ -match "(CTP\*{1,}).*\*(\d{1,}(\.\d{1,})?)"){
                   $i++
                   $total += [double]$count * [double]$Matches[2]
                   $Desc = $Data[([Array]::IndexOf($Data,$_)+1)] -Replace "PID\*\w\*{1,}(\d+\*{1,})?|<NL>"
                   [String[]]$line += "`r`nItem $i`: UPC: $UPC, Desc: $Desc, Qty: $count Price  `$$($Matches[2])"
             }
       }
  }
 }

 If(Test-Path $InputFile){
  ParseText $OutputFile $InputFile
 }

The output desired is similar to this:
PONumber: 235454654
Item 1: UPC: 12345678945612, Desc: Desc 1, Qty: 88 Price  $10.00
Item 2: UPC: 12345678921111, Desc: Desc 2, Qty: 88 Price  $10.50
Item 3: UPC: 12345678955555, Desc: Desc 3, Qty: 88 Price  $12.00
Item 4: UPC: 12345644545455, Desc: Desc 4, Qty: 88 Price  $5.00
Item 5: UPC: 12345689989889, Desc: Desc 6, Qty: 88 Price  $1.00
Total: $100.00

PONumber: 235454100
Item 1: UPC: 12345678945612, Desc: Desc 1, Qty: 88 Price  $10.00
Item 2: UPC: 12345678921111, Desc: Desc 2, Qty: 88 Price  $10.50
Item 3: UPC: 12345678955555, Desc: Desc 3, Qty: 88 Price  $12.00
Item 4: UPC: 12345644545455, Desc: Desc 4, Qty: 88 Price  $5.00
Item 5: UPC: 12345689989889, Desc: Desc 6, Qty: 88 Price  $1.00
Total: $100.00

The script does not work correctly.

If there is one PO with, say, 5 item lines, the script repeats the first Description, Qty and Price 5 times instead of using the data in original.txt that belongs to each line. The UPC is picked up correctly; only the rest of the data on each line is wrong.

What modification do I need to make to the PowerShell script to get the desired output?
100questionsAsked:

footechCommented:
If this is building off previous posts, you should include a link to the post(s), instead of leaving it for others to search through your previous questions to see what you might be referring to.

There's not enough sample data here to know what's different and where (at least I'm not seeing it).  I'll repeat my earlier advice.
Go through all of your data and work out the variations that can occur. You can post all the lines that vary here (in fact, that would probably be best). Group them according to what data should be extracted from each line. For example, here are a couple of examples of the PO line (note where the PO number is in each case):
BEG*00*SA*1234567**20140917<NL>
G20*N*20140902*98X123456~

At least one sample for each variation would be good.

Variations can include:
 - placement of the desired portion within a line
 - which characters surround the portion
 - how the portion itself can vary, such as
    - what kind of characters it can include (e.g. numbers only, letters and numbers, etc.)
    - the length of the portion

All these things are needed to come up with a regex pattern that matches your data under various circumstances. Otherwise you're stuck modifying the regex pattern again and again each time something slightly different occurs.
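One way to act on this advice is to keep the sample lines in a list and run each candidate pattern over all of them, so that any variation the pattern misses shows up immediately. The sketch below is only an illustration: the pattern is an assumption based on the BEG samples in this thread, and the G20 line is included on purpose to show a variation that the pattern does not cover.

```powershell
# Candidate pattern for the PO number on a BEG line.
# Assumption: derived only from the BEG samples posted in this thread.
$Pattern = '^BEG\*\d{2}\*SA\*(?<PONumber>\d+)\*'

# Sample lines taken from the thread.
$Samples = @(
    'BEG*00*SA*1234567**20140917<NL>'
    'BEG*00*SA*4610173073**20140929~'
    'G20*N*20140902*98X123456~'
)

# Record, for every sample, whether the pattern matched and what it captured.
$Report = foreach ($Sample in $Samples) {
    if ($Sample -match $Pattern) {
        [pscustomobject]@{ Line = $Sample; Matched = $true; PONumber = $Matches['PONumber'] }
    }
    else {
        [pscustomobject]@{ Line = $Sample; Matched = $false; PONumber = $null }
    }
}

$Report | Format-Table -AutoSize
```

Any row with Matched set to False marks a variation that needs its own pattern or a more general one.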
100questionsAuthor Commented:
@subsun - since you were very much involved in previous solutions, I would also welcome assistance from you.
SubsunCommented:
To confirm: the PONumber is 4610173073, from the line below.
BEG*00*SA*4610173073**20140929~

UPC: 12345678912345 and Qty: 10, from the line below.
PO1**10*CA***UK*12345678912345~

Price $10.20, from the line below.
CTP**UCP*10.20~

Desc: Desc 1, from the line below.
PID*F*08***Desc 1~

And the line below comes after each data set.
IEA*1*000001234~

Is that correct?
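The mapping above can also be checked by splitting each sample line on the `*` separator and reading the fields by position. This is a minimal sketch, assuming the field positions seen in these four samples hold for the rest of the file; the variable names are made up for illustration.

```powershell
# Drop the trailing '~' terminator, then split each sample line on '*'.
# Empty fields are kept, so positions stay stable.
$BegFields = 'BEG*00*SA*4610173073**20140929~'.TrimEnd('~') -split '\*'
$Po1Fields = 'PO1**10*CA***UK*12345678912345~'.TrimEnd('~') -split '\*'
$CtpFields = 'CTP**UCP*10.20~'.TrimEnd('~') -split '\*'
$PidFields = 'PID*F*08***Desc 1~'.TrimEnd('~') -split '\*'

# Field positions are assumptions taken from the samples in this thread.
$Item = [pscustomobject]@{
    PONumber = $BegFields[3]
    Qty      = [int]$Po1Fields[2]
    UPC      = $Po1Fields[7]
    Price    = [decimal]$CtpFields[3]
    Desc     = $PidFields[5]
}

$Item
```

Position-based splitting is easier to read than a long regex, but it breaks as soon as a line carries an extra or missing field, so it is only as reliable as the samples it was derived from.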
100questionsAuthor Commented:
Yes, thanks for looking at this. All of that is correct. To be more specific, each data set starts with a line beginning with ISA*00* .... and ends with a line beginning with IEA*1*...
Hope this helps.
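Given that description, the input could be cut into data sets before any parsing happens. The following is a rough sketch only: `Split-DataSet` is a hypothetical helper, the ISA lines in the sample are placeholders that contain nothing but the stated prefix, and the second IEA value is made up.

```powershell
# Hypothetical helper: emits one string array per ISA...IEA data set.
function Split-DataSet {
    param([string[]]$Lines)

    $Current = $null
    foreach ($Line in $Lines) {
        if ($Line -match '^ISA\*00\*') {
            # A new data set opens: start collecting lines.
            $Current = New-Object 'System.Collections.Generic.List[string]'
        }
        if ($null -ne $Current) {
            $Current.Add($Line)
        }
        if ($Line -match '^IEA\*1\*' -and $null -ne $Current) {
            # The data set closes: emit the collected lines as a single array.
            ,$Current.ToArray()
            $Current = $null
        }
    }
}

# Placeholder sample: two data sets (ISA lines shortened to the known prefix).
$Sample = @(
    'ISA*00*'
    'BEG*00*SA*4610173073**20140929~'
    'IEA*1*000001234~'
    'ISA*00*'
    'BEG*00*SA*4610173074**20140929~'
    'PO1**10*CA***UK*12345678912345~'
    'IEA*1*000001235~'
)

$Sets = @(Split-DataSet -Lines $Sample)
"Data sets found: $($Sets.Count)"
```

Lines outside an ISA...IEA pair are ignored by this sketch, which may or may not be what the real files need.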
SubsunCommented:
I don't think there will be a problem with the script unless the input file contains duplicate CTP lines. Do you expect duplicate CTP lines in the input file?

CTP**UCP*10.20~
100questionsAuthor Commented:
Yes, it definitely will happen since some products will have the same price.
The price found in the CTP should always correspond to the PO1 information which is before it.
That's probably why there are issues with the output.
Is this something that could be fixed?
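The effect of duplicate CTP lines can be reproduced in isolation. `[Array]::IndexOf` returns the position of the first element equal to the value it is given, so when two CTP lines are identical, both lookups land on the first one and pick up the PID line that follows it. A minimal sketch, with a made-up second item that shares the price of the first:

```powershell
# Hypothetical filtered data: two items with the same price.
$Data = @(
    'PO1**10*CA***UK*12345678912345~'
    'CTP**UCP*10.20~'
    'PID*F*08***Desc 1~'
    'PO1**5*CA***UK*12345678900000~'
    'CTP**UCP*10.20~'
    'PID*F*08***Desc 2~'
)

# Look up the line after each CTP line the same way the original script does.
$Found = foreach ($Line in $Data) {
    if ($Line -match '^CTP\*') {
        $Data[[Array]::IndexOf($Data, $Line) + 1]
    }
}

# Both entries are the first PID line; 'Desc 2' is never reached.
$Found
```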
SubsunCommented:
It can be fixed. Will there always be a PID line after the CTP line?
100questionsAuthor Commented:
Yes, there will, thanks very much.
SubsunCommented:
Try this:
$InputFile = "C:\Data\original.txt"
$OutputFile = "C:\Data\new.txt"

Function ParseText ($OutputFile,$InputFile){
  Begin{
    $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
    Set-Content $OutputFile $null
    #Keep only the line types the report needs
    $Data = Get-Content -Path $InputFile | ?{$_ -match "(BEG\*)|(PO1\*)|(CTP\*)|(PID\*)|(IEA\*)"}
  }
  Process{
    $Data | % {
      #Check the end of data set
      If ($_ -match "IEA\*"){
        If($poline -ne $null -and $total -ne $null){
          $Totline = "Total: `$$Total"
          Write-host "This is Total `$$total"
          #Write the data collection to output file
          "$poline$line`r`n$totline`r`n" | out-file $OutputFile -Encoding UTF8 -Append
          #reset the variables
          $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
        }
      }
      #Collect the PO Number
      If($_ -match "(SA\*)(\d{3,10})"){
        $poline = "PONumber: $($Matches[2])"
      }
      #Collect the quantity and UPC from the PO1 line
      If($_ -match "(PO1\*{1,})(\d{1,})\*\w{2}\*{1,}\w{2}\*(\d{1,})"){
        $count = $Matches[2]
        $UPC = $Matches[3]
      }
      #Collect the price from the CTP line and add it to the running total
      If($_ -match "(CTP\*{1,}).*\*(\d{1,}(\.\d{1,})?)"){
        $Price = $Matches[2]
        $total += [double]$count * [double]$Price
      }
      #The PID line completes the item: write it using the values saved from the PO1 and CTP lines
      If($_ -match "PID\*"){
        $i++
        $Desc = $_ -Replace "PID\*\w\*{1,}(\d+\*{1,})?|<NL>|~"
        [String[]]$line += "`r`nItem $i`: UPC: $UPC, Desc: $Desc, Qty: $count Price  `$$Price"
      }
    }
  }
}

If(Test-Path $InputFile){
  ParseText $OutputFile $InputFile
}
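
The same line-by-line state tracking can be exercised on an in-memory sample, without touching any files, to confirm that items with identical prices keep their own descriptions. This is a minimal sketch rather than a replacement for the script above: `Get-PoItem` is a hypothetical name, the second item is made up, and it emits objects instead of formatted text.

```powershell
# Hypothetical helper: emits one object per PO1/CTP/PID group.
function Get-PoItem {
    param([string[]]$Lines)

    $Qty = $Upc = $Price = $null
    foreach ($Line in $Lines) {
        if ($Line -match '^PO1\*+(\d+)\*\w{2}\*+\w{2}\*(\d+)') {
            # Remember quantity and UPC until the item is complete.
            $Qty, $Upc = $Matches[1], $Matches[2]
        }
        elseif ($Line -match '^CTP\*.*\*(\d+(\.\d+)?)') {
            # Remember the price of the current item.
            $Price = $Matches[1]
        }
        elseif ($Line -match '^PID\*') {
            # The PID line closes the item, so emit it with the values seen so far.
            [pscustomobject]@{
                UPC   = $Upc
                Desc  = $Line -replace '^PID\*\w\*+(\d+\*+)?|<NL>|~'
                Qty   = [int]$Qty
                Price = [decimal]$Price
            }
        }
    }
}

# Made-up sample: two items that share the same price.
$Items = @(Get-PoItem -Lines @(
    'PO1**10*CA***UK*12345678912345~'
    'CTP**UCP*10.20~'
    'PID*F*08***Desc 1~'
    'PO1**5*CA***UK*12345678900000~'
    'CTP**UCP*10.20~'
    'PID*F*08***Desc 2~'
))

$Items | Format-Table -AutoSize
```

Because each value is taken from the line currently being read, nothing depends on searching the array for a matching line, so duplicate lines no longer matter.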


PS: As I explained earlier (and as my fellow expert footech also explained), you should work out the possible variations of the input files so that a single script can parse all of them. Otherwise you will end up with a separate script for each set of input files.
100questionsAuthor Commented:
Thanks for the advice. I have only one more request, which I've posted after this one. If I have a handful in the future, I will group the requests together. For now the separate PowerShell files are working well and serve the purpose I need.
As for this solution, it works very well. Excellent work.
100questionsAuthor Commented:
Excellent work.