Solved

Further modification of the PowerShell script

Posted on 2014-09-29
Last Modified: 2014-10-01
This continues my earlier requests for modifications to the PowerShell script.

The unique lines are the following:

ST*850*0001~
BEG*00*SA*4610173073**20140929~
...

PO1**10*CA***UK*12345678912345~
CTP**UCP*10.20~
PID*F*08***Desc 1~
....
The above 3 lines (PO1, CTP, PID) can repeat any number of times.

IEA*1*000001234~

Script to work with:

$InputFile = "C:\Data\original.txt"
$OutputFile = "C:\Data\new.txt"

 Function ParseText ($OutputFile,$InputFile){
 Begin{
             $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
             Set-Content $OutputFile $null
             $Data = Get-Content -Path $InputFile | ?{$_ -match "(BEG\*)|(PO1\*)|(CTP\*)|(PID\*)|(IEA\*)"}
       }
 Process{
       $Data | % {
       #Check the end of data set
             If ($_ -match "IEA\*"){
                   If($poline -ne $null -and $total -ne $null){
                         $Totline = "Total: `$$Total"
                         Write-host "This is Total `$$total"
                         #Write the data collection to output file
                         "$poline$line`r`n$totline`r`n" | out-file $OutputFile -Encoding UTF8 -Append
                         #reset the variables
                         $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
                   }
             }
             #Collect the PO Number
             If($_ -match "(SA\*)(\d{3,10})"){
                   $poline = "PONumber: $($Matches[2])"
             }
             #Collect  G80 details and following G81 Description
             If($_ -match "(PO1\*{1,})(\d{1,})\*\w{2}\*{1,}\w{2}\*(\d{1,})"){
                   $count = $Matches[2]
                   $UPC = $Matches[3]
             }
             If($_ -match "(CTP\*{1,}).*\*(\d{1,}(\.\d{1,})?)"){
                   $i++
                   $total += [double]$count * [double]$Matches[2]
                   $Desc = $Data[([Array]::IndexOf($Data,$_)+1)] -Replace "PID\*\w\*{1,}(\d+\*{1,})?|<NL>"
                   [String[]]$line += "`r`nItem $i`: UPC: $UPC, Desc: $Desc, Qty: $count Price  `$$($Matches[2])"
             }
       }
  }
 }

 If(Test-Path $InputFile){
  ParseText $OutputFile $InputFile
 }

The output desired is similar to this:
PONumber: 235454654
Item 1: UPC: 12345678945612, Desc: Desc 1, Qty: 88 Price  $10.00
Item 2: UPC: 12345678921111, Desc: Desc 2, Qty: 88 Price  $10.50
Item 3: UPC: 12345678955555, Desc: Desc 3, Qty: 88 Price  $12.00
Item 4: UPC: 12345644545455, Desc: Desc 4, Qty: 88 Price  $5.00
Item 5: UPC: 12345689989889, Desc: Desc 6, Qty: 88 Price  $1.00
Total: $100.00

PONumber: 235454100
Item 1: UPC: 12345678945612, Desc: Desc 1, Qty: 88 Price  $10.00
Item 2: UPC: 12345678921111, Desc: Desc 2, Qty: 88 Price  $10.50
Item 3: UPC: 12345678955555, Desc: Desc 3, Qty: 88 Price  $12.00
Item 4: UPC: 12345644545455, Desc: Desc 4, Qty: 88 Price  $5.00
Item 5: UPC: 12345689989889, Desc: Desc 6, Qty: 88 Price  $1.00
Total: $100.00

The script is not producing the right output.

If there is one PO with, say, 5 item lines, the script repeats the first Description, Qty and Price on all 5 lines instead of showing the data from original.txt that belongs to each line. The UPC seems to be picked up correctly; only the rest of the data associated with each line is wrong.

What modification do I need to make to the PowerShell script to get the desired output?
Question by:100questions
11 Comments
 
LVL 40

Expert Comment

by:footech
ID: 40351076
If this is building off previous posts, you should include a link to the post(s), instead of leaving it for others to search through your previous questions to see what you might be referring to.

There's not enough sample data here to know what's different and where (at least I'm not seeing it).  I'll repeat my earlier advice.
Go through all of your data and work out the variations that can occur.  You can post all the lines that vary here (in fact, that would probably be best).  Group them according to what data should be extracted from each line.  For example, here are a couple of examples of the PO line (note where the PO number is in each case):
BEG*00*SA*1234567**20140917<NL>
G20*N*20140902*98X123456~

At least one sample for each variation would be good.

Variations can include:
 -placement of a desired portion within a line
 -what characters are surrounding the portion
 -how the portion itself can vary, like
    -what kind of characters it can include (e.g. numbers only, letters and numbers, etc.)
    -the length of the portion
 All these things are needed to come up with a regex pattern that will match your data under various circumstances.  Otherwise you're stuck with modifying the regex pattern again and again each time something slightly different occurs.
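
As a rough sketch of that kind of check, the snippet below runs the PO-number pattern from the script against a few sample lines quoted in this thread and reports which ones it handles. It assumes PowerShell 3.0 or later (for [pscustomobject]); the variable names are only illustrative.

```powershell
# Sample lines copied from this thread.
$samples = @(
    'BEG*00*SA*1234567**20140917<NL>',
    'BEG*00*SA*4610173073**20140929~',
    'G20*N*20140902*98X123456~'
)

# The PO-number pattern used by the script under discussion.
$pattern = '(SA\*)(\d{3,10})'

# Record, for every sample, whether the pattern matched and what it captured.
$results = foreach ($s in $samples) {
    if ($s -match $pattern) {
        [pscustomobject]@{ Line = $s; Matched = $true;  PONumber = $Matches[2] }
    } else {
        [pscustomobject]@{ Line = $s; Matched = $false; PONumber = $null }
    }
}

$results | Format-Table -AutoSize
```

Any line reported as unmatched (here the G20 line) is a variation the pattern does not cover yet, so it is worth collecting such lines before changing the regex.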
 

Author Comment

by:100questions
ID: 40351106
@subsun - since you were very much involved in previous solutions, I would also welcome assistance from you.
 
LVL 40

Expert Comment

by:Subsun
ID: 40352730
To confirm, the PONumber is 4610173073, taken from this line:
BEG*00*SA*4610173073**20140929~


UPC 12345678912345 and Qty 10 come from this line:
PO1**10*CA***UK*12345678912345~

Price $10.20 comes from this line:
CTP**UCP*10.20~

Desc "Desc 1" comes from this line:
PID*F*08***Desc 1~

And this line comes after each data set:
IEA*1*000001234~

Is that correct?
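
One way to double-check that mapping, sketched here with a different technique from the script's regexes: split each line on the * delimiter and look at the field positions. Split-Segment is a hypothetical helper name, and the sketch assumes every line ends with ~ as in the samples above.

```powershell
# Hypothetical helper: drop the trailing ~ and split the segment on *.
function Split-Segment ([string]$Segment) {
    $Segment.TrimEnd('~').Split('*')
}

$beg = Split-Segment 'BEG*00*SA*4610173073**20140929~'
$po1 = Split-Segment 'PO1**10*CA***UK*12345678912345~'
$ctp = Split-Segment 'CTP**UCP*10.20~'
$pid1 = Split-Segment 'PID*F*08***Desc 1~'

# Field positions are zero-based; empty fields between ** still count.
"PONumber: $($beg[3])"
"UPC: $($po1[7]), Qty: $($po1[2])"
"Price: $($ctp[3])"
"Desc: $($pid1[5])"
```

This only confirms where each value sits in these particular sample lines; other layouts may put the same data in a different position.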
 

Author Comment

by:100questions
ID: 40352776
Yes, thanks for looking at this.  That is all correct.  To be more specific, each data set starts with a line beginning ISA*00* ....  and ends with a line beginning IEA*1*...
Hope this helps.
 
LVL 40

Expert Comment

by:Subsun
ID: 40352843
I don't think there will be a problem with the script unless the input file contains duplicate CTP lines. Do you expect duplicate CTP lines in the input file?

CTP**UCP*10.20~
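
A small sketch of why duplicates matter, using made-up sample lines: the script in the question finds the description with [Array]::IndexOf($Data,$_), and IndexOf returns the index of the first element equal to the value it is given. Two identical CTP lines therefore both resolve to the first one.

```powershell
# Illustrative data: two items that happen to share the same price line.
$Data = @(
    'CTP**UCP*10.20~',
    'PID*F*08***Desc 1~',
    'CTP**UCP*10.20~',
    'PID*F*08***Desc 2~'
)

# Same lookup idea as the script in the question: the line after the CTP line.
$lookups = foreach ($line in $Data) {
    if ($line -match '^CTP\*') {
        $index = [Array]::IndexOf($Data, $line)   # first occurrence only
        $Data[$index + 1]
    }
}

$lookups   # both entries are 'PID*F*08***Desc 1~'
```

The second CTP line never reaches 'Desc 2', which is consistent with the repeated description reported in the question.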
 

Author Comment

by:100questions
ID: 40352993
Yes, it definitely will happen since some products will have the same price.
The price found in the CTP should always correspond to the PO1 information which is before it.
That's probably why there are issues with the output.
Is this something that could be fixed?
 
LVL 40

Expert Comment

by:Subsun
ID: 40353373
It can be fixed. Will there always be a PID line after the CTP line?
 

Author Comment

by:100questions
ID: 40353492
Yes, there will, thanks very much.
 
LVL 40

Accepted Solution

by:
Subsun earned 500 total points
ID: 40353536
Try..
$InputFile = "C:\Data\original.txt"
$OutputFile = "C:\Data\new.txt"

Function ParseText ($OutputFile,$InputFile){
  Begin{
    $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
    Set-Content $OutputFile $null
    #Keep only the lines the report needs
    $Data = Get-Content -Path $InputFile | ?{$_ -match "(BEG\*)|(PO1\*)|(CTP\*)|(PID\*)|(IEA\*)"}
  }
  Process{
    $Data | % {
      #Check the end of data set
      If ($_ -match "IEA\*"){
        If($poline -ne $null -and $total -ne $null){
          $Totline = "Total: `$$Total"
          Write-host "This is Total `$$total"
          #Write the data collection to output file
          "$poline$line`r`n$totline`r`n" | out-file $OutputFile -Encoding UTF8 -Append
          #reset the variables
          $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
        }
      }
      #Collect the PO Number
      If($_ -match "(SA\*)(\d{3,10})"){
        $poline = "PONumber: $($Matches[2])"
      }
      #Collect the quantity and UPC from the PO1 line
      If($_ -match "(PO1\*{1,})(\d{1,})\*\w{2}\*{1,}\w{2}\*(\d{1,})"){
        $count = $Matches[2]
        $UPC = $Matches[3]
      }
      #Collect the price from the CTP line and add it to the running total
      If($_ -match "(CTP\*{1,}).*\*(\d{1,}(\.\d{1,})?)"){
        $Price = $Matches[2]
        $total += [double]$count * [double]$Price
      }
      #The PID line completes an item: take the description from the current
      #line itself (no index lookup) and use the most recent PO1/CTP values
      If($_ -match "PID\*"){
        $i++
        $Desc = $_ -Replace "PID\*\w\*{1,}(\d+\*{1,})?|<NL>|~"
        [String[]]$line += "`r`nItem $i`: UPC: $UPC, Desc: $Desc, Qty: $count Price  `$$Price"
      }
    }
  }
}

If(Test-Path $InputFile){
  ParseText $OutputFile $InputFile
}


PS: As I explained earlier (and as my fellow expert footech also explained), you should work out the possible variations of the input files so that a single script can parse all of them. Otherwise you will end up with a separate script for each set of input files.
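
As a hedged sketch of what a single script for several layouts could build on, the patterns can live in one table keyed by segment name, so a new variation means editing a table entry rather than copying the script. The names $SegmentPatterns and Read-Segment are hypothetical, the patterns are adapted from the script above with named capture groups, and they have only been checked against the sample lines in this thread.

```powershell
# Hypothetical pattern table: one regex per segment type, with named groups.
$SegmentPatterns = [ordered]@{
    BEG = '^BEG\*\d+\*SA\*(?<PONumber>\d{3,10})'
    PO1 = '^PO1\*+(?<Qty>\d+)\*\w{2}\*+\w{2}\*(?<UPC>\d+)'
    CTP = '^CTP\*+.*\*(?<Price>\d+(\.\d+)?)'
    PID = '^PID\*\w\*+(\d+\*+)?(?<Desc>[^~]+)'
}

# Hypothetical helper: return the segment name and its named fields,
# or nothing when no pattern in the table matches the line.
function Read-Segment ([string]$Line) {
    foreach ($name in $SegmentPatterns.Keys) {
        if ($Line -match $SegmentPatterns[$name]) {
            $fields = @{ Segment = $name }
            foreach ($key in $Matches.Keys) {
                # Named groups have string keys; numbered groups have integer keys.
                if ($key -is [string]) { $fields[$key] = $Matches[$key] }
            }
            return $fields
        }
    }
}

$item = Read-Segment 'PO1**10*CA***UK*12345678912345~'
"Segment $($item.Segment): Qty $($item.Qty), UPC $($item.UPC)"
```

Lines that return nothing from Read-Segment are either irrelevant segments or variations that still need a pattern.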
 

Author Comment

by:100questions
ID: 40354445
Thanks for the advice.  I only have 1 more left, which I've posted after this.  If I have a handful in the future, I will group the requests together.  For now the separate PowerShell files are working well and serve the purpose I need.
As for this solution, it works very well.  Excellent work.
 

Author Closing Comment

by:100questions
ID: 40354446
Excellent work.